1. Introduction 

The aim of this paper is to study extreme eigenvalues of matrices from deformed 
random matrix ensembles. We will consider both the deformed GOE and the 
spiked population model. 

1.1. Models and some known results 

The Gaussian Orthogonal Ensemble, or GOE for short, is probably the most 
widely studied model in random matrix theory. The deformed GOE is a finite 
rank perturbation of the Gaussian Orthogonal Ensemble. More precisely, let 
$G \in \mathrm{GOE}(n, \sigma^2/n)$ and let $P$ be a real symmetric matrix; we want to study the 
extreme eigenvalues of $A = P + G$. 

When the dimension goes to infinity, the asymptotic properties of the largest 
eigenvalues of matrices from the deformed GOE have been studied by various 
authors, who established the a.e. limit, the CLT and a large deviation principle. 
Similar results were also obtained in the non-Gaussian case. See [9] (a.e. limit 
for $\lambda_1(A)$, the earliest progress on this problem), [23] (CLT for general $\lambda_i(A)$, 
Gaussian case), [10] (CLT for $\lambda_1(A)$, non-Gaussian case), [7] (CLT for general 
$\lambda_i(A)$, non-Gaussian case), [20] (large deviation for $\lambda_1(A)$, $\mathrm{rank}(P) = 1$, 
Gaussian case), [12] (a.e. limit for general $\lambda_i(A)$, unitarily invariant case). 

Another model we consider in this paper is the spiked population model, 
first proposed by [15]. Here we have independent samples drawn from a Gaussian 
distribution with covariance matrix $\Sigma$ having all but a few eigenvalues equal to one. 
The object under study is the "spiked eigenvalues" of the sample covariance 
matrix $S_n$. If $\Sigma = I$, then $S_n$ is a Wishart matrix. So the spiked population 
model can be considered as a finite rank perturbation of the Wishart matrix 
ensemble. 

This model has also been extensively studied in the literature, and the asymptotic 
properties of the largest eigenvalues of $S_n$ have been established. The groundbreaking 
work on this problem is [3], in which the CLT for $\lambda_1(S_n)$ was derived for the 
complex Gaussian case. See also [22] (CLT for $\lambda_1(S_n)$, real Gaussian case), [5] 
(a.e. limit for $\lambda_1(S_n)$, non-Gaussian case), [16] (CLT for $\lambda_1(S_n)$, non-Gaussian 
case), [11] (CLT for $\lambda_1(S_n)$, non-Gaussian case). 

1.2. Main results of this paper 

Instead of considering asymptotic properties, this paper establishes sharp 
deviation bounds for the extreme eigenvalues of matrices from the deformed GOE 
and the spiked population model. 

Our result about the deformed GOE is theorem 3.1, in which we prove 

$$P(|\lambda_i(A) - \lambda_{\theta_i}| > t) \le C_1 e^{-C_2 n t^2/\sigma^2},$$

where $\lambda_i(A)$ is the $i$-th largest eigenvalue of $A = P + G$, $\theta_i$ is the $i$-th largest 
eigenvalue of $P$, and $\lambda_{\theta_i}$ is defined as 

$$\lambda_{\theta_i} = \begin{cases} \theta_i + \dfrac{\sigma^2}{\theta_i} & \text{if } \theta_i > \sigma, \\ 2\sigma & \text{if } 0 < \theta_i \le \sigma. \end{cases}$$

Theorem 3.1 assumes that $P$ has only nonnegative eigenvalues. A similar result 
for the smallest eigenvalues of $A$ holds when $P$ has negative eigenvalues. 

Our results about the spiked population model are divided into two parts. 
Theorem 3.2 establishes deviation bounds for the largest eigenvalues, and 
theorem 3.3 establishes deviation bounds for the smallest eigenvalues. We can 
summarize these two theorems as follows. Let $\theta^2$ be an eigenvalue of the 
population covariance matrix $\Sigma$, $\theta^2 \ne 1$. Then the corresponding "spiked 
eigenvalue" $\lambda(S_n)$ of the sample covariance matrix satisfies 

$$P(|\lambda(S_n) - \lambda_{\theta,c}| > t) \le C_1 e^{-C_2 n t^2},$$

where $\lambda_{\theta,c}$ is defined as 

$$\lambda_{\theta,c} = \begin{cases} \theta^2 + c\,\dfrac{\theta^2}{\theta^2 - 1} & \text{if } \theta^2 > 1 + \sqrt{c}, \text{ or } c < 1,\ \theta^2 < 1 - \sqrt{c}, \\ (1 + \sqrt{c})^2 & \text{if } 1 < \theta^2 \le 1 + \sqrt{c}, \\ (1 - \sqrt{c})^2 & \text{if } c < 1,\ 1 - \sqrt{c} \le \theta^2 < 1. \end{cases}$$


Unlike the traditional approach, our method does not use the moment 
method, the Stieltjes transform, or the joint density formula for eigenvalues. 
Instead, we use the min-max characterization of eigenvalues and concentration 



Minyu Peng /Eigenvalues of Deformed Random Matrices 






of measure for Gaussian processes to prove the upper tail bound for the largest 
eigenvalues, and use explicit construction of eigenvectors to prove the lower tail 
bound. 

In the existing literature, the study of the deformed GOE and that of the spiked 
population model require completely different techniques; see [23] and [22]. Our 
method has the advantage of treating these two models in the same way. Once the 
basic idea is understood, the proofs of the three theorems are almost identical. 
See section 4 for an outline of the proof. 



2. Notation 



$x \in \mathbb{R}^n$ is considered as a column vector, $x^*$ is the transpose, $x_j$ is the $j$-th 
coordinate, $|x| = \sqrt{\sum_{j=1}^n x_j^2}$ is the Euclidean norm. For $x, y \in \mathbb{R}^n$, let $\langle x, y \rangle = 
\sum_{j=1}^n x_j y_j$; $x \perp y$ means $\langle x, y \rangle = 0$. $S^{n-1} = \{x \in \mathbb{R}^n : |x| = 1\}$. A metric space 
will be written as $(X, d)$ where $d$ is the metric. For example, $(S^{n-1}, |\cdot|)$ is $S^{n-1}$ 
with the Euclidean metric. A metric will be specified whenever we discuss an $\epsilon$-net. 

$\mathbb{R}^{p \times n}$ is the set of all $p \times n$ real matrices. $\mathbb{R}^{n \times n}_{\mathrm{sym}}$ is the set of all $n \times n$ real 
symmetric matrices. For $A \in \mathbb{R}^{p \times n}$, $\|A\|$ is the largest singular value, $A^*$ is the 
transpose. $E_{i,j}$ is the matrix with 1 on the $(i,j)$ entry and 0 elsewhere; the 
ambient dimension will be clear whenever we use this notation. The Gaussian 
Orthogonal Ensemble is defined as 

$$\mathrm{GOE}(n, \tfrac{\sigma^2}{n}) = \{A \in \mathbb{R}^{n \times n}_{\mathrm{sym}} : a_{i,j},\ 1 \le i \le j \le n, \text{ are independent};\ a_{i,i} \sim N(0, \tfrac{2\sigma^2}{n});\ a_{i,j} \sim N(0, \tfrac{\sigma^2}{n}),\ i < j\}$$

where $N(\mu, \sigma^2)$ denotes the Gaussian distribution with mean $\mu$ and variance 
$\sigma^2$. 
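The definition above can be sampled directly. The following is a minimal numerical sketch, assuming NumPy; the helper name `sample_goe` is illustrative and not from the paper. It symmetrizes an i.i.d. Gaussian matrix so that the diagonal has variance $2\sigma^2/n$ and the off-diagonal entries have variance $\sigma^2/n$.

```python
import numpy as np


def sample_goe(n, sigma=1.0, rng=None):
    """Symmetric n x n matrix: diagonal N(0, 2*sigma^2/n), off-diagonal N(0, sigma^2/n)."""
    rng = np.random.default_rng() if rng is None else rng
    z = rng.standard_normal((n, n))
    # (z + z.T) / sqrt(2) has variance 2 on the diagonal and 1 off the diagonal.
    return sigma * (z + z.T) / np.sqrt(2.0 * n)


g = sample_goe(400, sigma=1.0, rng=np.random.default_rng(0))
top = np.linalg.eigvalsh(g)[-1]  # eigvalsh returns eigenvalues in ascending order
```

With this normalization the spectrum fills roughly $[-2\sigma, 2\sigma]$, so `top` should be close to $2\sigma$ for moderately large `n`.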

The size of a finite set $A$ will be denoted by $|A|$. For $a, b \in \mathbb{R}$, $a \vee b = \max\{a, b\}$. 



3. Statement of Main Results 



This section contains our main results. Theorem 3.1 is our result about the 
deformed GOE. Theorem 3.2 and theorem 3.3 are our results about the spiked 
population model. 

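Theorem 3.1. Let $A = P + G$, $G \in \mathrm{GOE}(n, \sigma^2/n)$, and let $P \in \mathbb{R}^{n \times n}_{\mathrm{sym}}$ have rank $r$ with 
eigenvalues $\theta_1 \ge \cdots \ge \theta_r > 0$. Let $\lambda_1(A) \ge \cdots \ge \lambda_n(A)$ be the eigenvalues of 
$A$. Define 

$$\lambda_\theta = \begin{cases} \theta + \dfrac{\sigma^2}{\theta} & \text{if } \theta > \sigma, \\ 2\sigma & \text{if } 0 < \theta \le \sigma. \end{cases} \qquad (3.1)$$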

(i): Let Ci, e (n) = • n, C mfi {n) - 2mC M (n)(l + %iM)™-\ m > 2. 

When r > 0, we have 

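$$P(\lambda_i(A) > \lambda_{\theta_i} + t) \le 2C_{r-i+1,\theta_i}(n) \cdot e^{-\frac{(1-\delta)^2 n t^2}{4\sigma^2}} \qquad (3.2)$$

for $1 \le i \le r$, $t \ge \frac{\sqrt{2(r-i+1)}\,\sigma}{\sqrt{\delta(1-\delta)n}}$, $0 < \delta \le \frac{1}{2}$. When $r = 0$, we have 

$$P(\lambda_1(A) > 2\sigma + t) \le e^{-\frac{nt^2}{4\sigma^2}}, \quad t \ge 0 \qquad (3.3)$$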

(ii): Let $r_0$ be the number of $\theta_i$ larger than $\sigma$. If $r_0 > 0$, then for $1 \le i \le r_0$, 
$t \ge 0$, we have 

_(n- r )(9 i - Z )f c, ( „_ r) ( 8 ._ g) 5 t 2 

P(Ai(^)<A 9i -i-Ciar/n)<e m "' i0 'i + 8i ■ e (3.4) 

where $C_1, C_2$ are two positive constants. (We can pick $C_1 = 2$, $C_2 = 0.25$.) 

We assume $\theta_i > 0$ for simplicity; theorem 3.1 holds with trivial modifications 
when $P$ has both positive and negative eigenvalues. 

Part (ii) of theorem 3.1 provides a lower tail bound for $\lambda_i(A)$ only when 
$\theta_i > \sigma$. When $\theta_i \le \sigma$, we can use the semicircle law to get a lower tail bound for 
$\lambda_i(A)$. This is intuitively clear: the interval $[2\sigma - \epsilon, 2\sigma]$ should contain about $\epsilon n$ 
eigenvalues. See [1] for a rigorous derivation. Our result shows that $\lambda_i(A)$ will 
not exit the semicircle law band when $\theta_i \le \sigma$. 

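Theorem 3.1 essentially says that, when $r$ is small, we have $\lambda_i(A) \approx \lambda_{\theta_i}$, $1 \le i \le r$. 
As a consequence, for fixed $r$, we have $\lambda_i(A) \to \lambda_{\theta_i}$ as $n \to \infty$, and the fluctuation 
of $\lambda_i(A)$ is of order at most $1/\sqrt{n}$. 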

We can also allow $r$ to grow with $n$; for example, if $r = o([\ldots])$, we still have 
$\lambda_i(A) \to \lambda_{\theta_i}$ as $n \to \infty$. This can be derived by using our deviation bound and the 
Borel-Cantelli lemma. The a.e. convergence of $\lambda_i(A)$ when $r$ grows at this rate 
cannot be derived by existing methods in the literature. 
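The statement $\lambda_i(A) \approx \lambda_{\theta_i}$ can be checked numerically. The sketch below assumes NumPy; the function names `lam_theta` and `top_eigs_deformed`, the matrix size and the chosen values of $\theta_i$ are illustrative assumptions, not taken from the paper. It compares the top $r$ eigenvalues of $P + G$ with the prediction (3.1).

```python
import numpy as np


def lam_theta(theta, sigma=1.0):
    """Predicted location from (3.1)."""
    return theta + sigma**2 / theta if theta > sigma else 2.0 * sigma


def top_eigs_deformed(thetas, n, sigma=1.0, seed=0):
    """Largest len(thetas) eigenvalues of P + G with P = diag(thetas, 0, ..., 0)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal((n, n))
    g = sigma * (z + z.T) / np.sqrt(2.0 * n)  # GOE(n, sigma^2/n)
    r = len(thetas)
    p = np.zeros((n, n))
    p[np.arange(r), np.arange(r)] = thetas
    eig = np.linalg.eigvalsh(p + g)[::-1]     # descending order
    return eig[:r]


thetas = [3.0, 2.0, 0.5]
observed = top_eigs_deformed(thetas, n=600)
predicted = np.array([lam_theta(t) for t in thetas])
```

For these values the prediction is roughly 3.33, 2.5 and 2.0: the two perturbations above $\sigma$ separate from the bulk, while the one below $\sigma$ stays at the edge $2\sigma$.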

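Theorem 3.2. Let $\Sigma = \mathrm{diag}\{\theta_1^2, \cdots, \theta_{r+s}^2, 1, \cdots, 1\} \in \mathbb{R}^{p \times p}$, $\theta_1 \ge \cdots \ge \theta_r > 
1 > \theta_{r+1} \ge \cdots \ge \theta_{r+s} > 0$. Let $G \in \mathbb{R}^{p \times n}$ have entries $g_{i,j}$ that are i.i.d. $N(0, 1)$. 
Consider the sample covariance matrix $S_n = \frac{1}{n}(\Sigma^{1/2} G)(\Sigma^{1/2} G)^*$, and let $\lambda_1(S_n) \ge 
\cdots \ge \lambda_p(S_n)$ be its eigenvalues. For $c > 0$, define 

$$\lambda_{\theta,c} = \begin{cases} \theta^2 + c\,\dfrac{\theta^2}{\theta^2 - 1} & \text{if } \theta^2 > 1 + \sqrt{c}, \text{ or } c < 1,\ \theta^2 < 1 - \sqrt{c}, \\ (1 + \sqrt{c})^2 & \text{if } 1 < \theta^2 \le 1 + \sqrt{c}, \\ (1 - \sqrt{c})^2 & \text{if } c < 1,\ 1 - \sqrt{c} \le \theta^2 < 1. \end{cases} \qquad (3.5)$$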

(i,): Letc = E ^ L , C 0) fi (n) = 1, Ci,e(n) = 2 ' (v< ^' + ' ) | », C m ,e(n) = 2mC 1 . e (n)(l+ 
Cu( " ) ) ro - 1 ,m> 2. Wiera r > 0. we have 

m— 1 ' ' — ? 

(l-5) 2 7tt 2 



P(V*i(S n ) > VX^~c + t) < C r _ mA (n) • e ""^T" (3.6) 



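for $1 \le i \le r$, $t \ge \frac{\sqrt{r-i+1}\,\theta_i}{\sqrt{\delta(1-\delta)n}}$, $0 < \delta \le \frac{1}{3}$. When $r = 0$, we have 

$$P(\sqrt{\lambda_1(S_n)} > 1 + \sqrt{p/n} + t) \le e^{-\frac{nt^2}{2}}, \quad t \ge 0 \qquad (3.7)$$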
(ii): Let $r_0$ be the number of $\theta_i^2$ larger than $1 + \sqrt{c}$, $c = \frac{p-r}{n}$. If $r_0 > 0$, then 
for $1 \le i \le r_0$, $t \ge 0$, we have 

$$P(\sqrt{\lambda_i(S_n)} \le \sqrt{\lambda_{\theta_i,c}} - t - C_1\theta_1 r/n) \le C_2\, i \cdot e^{-C_3 n t^2} \qquad (3.8)$$

where $C_1, C_2$ are two positive constants and $C_3$ is a positive real number 
depending on $\theta_1$ and $c$. 

Theorem 3.2 describes the relationship between the largest sample eigenvalues 
and the largest population eigenvalues. Loosely speaking, the population 
eigenvalue $\theta^2$ can be estimated by solving the equation 

$$\theta^2 + c\,\frac{\theta^2}{\theta^2 - 1} = \lambda(S_n)$$

if we observe a sample eigenvalue $\lambda(S_n) > (1 + \sqrt{c})^2$; population 
eigenvalues $\le 1 + \sqrt{p/n}$ are not estimable from the sample covariance matrix. 
We also have $\lambda_i(S_n) \to \lambda_{\theta_i,c}$ as $n \to \infty$ when $r = o([\ldots])$. 
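The estimation step described above amounts to inverting the first branch of (3.5). Writing $\ell = \theta^2$, the equation $\ell + c\ell/(\ell - 1) = \lambda$ is the quadratic $\ell^2 - (\lambda + 1 - c)\ell + \lambda = 0$, and the root above $1 + \sqrt{c}$ is the larger one. A minimal sketch in plain Python follows; the function names `lam_theta_c` and `estimate_theta2` are illustrative, not from the paper.

```python
import math


def lam_theta_c(theta2, c):
    """lambda_{theta,c} from (3.5), written as a function of theta^2 (theta^2 != 1)."""
    rc = math.sqrt(c)
    if theta2 > 1 + rc or (c < 1 and theta2 < 1 - rc):
        return theta2 + c * theta2 / (theta2 - 1)
    if theta2 > 1:
        return (1 + rc) ** 2
    if c < 1:
        return (1 - rc) ** 2
    raise ValueError("(3.5) does not cover theta^2 < 1 with c >= 1")


def estimate_theta2(lam, c):
    """Solve theta^2 + c*theta^2/(theta^2 - 1) = lam for the root theta^2 > 1 + sqrt(c)."""
    if lam <= (1 + math.sqrt(c)) ** 2:
        raise ValueError("sample eigenvalue is not above (1 + sqrt(c))^2")
    b = lam + 1 - c
    # Larger root of l^2 - b*l + lam = 0.
    return (b + math.sqrt(b * b - 4 * lam)) / 2


recovered = estimate_theta2(lam_theta_c(4.0, 0.5), 0.5)
```

Here `recovered` returns the population value 4.0 that was fed into `lam_theta_c`, which is the round trip the text describes.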

It's worth noting that the bounds we derive in theorem 3.2 do not 
depend on population eigenvalues that are smaller than one. This is important 
in applications with heteroscedasticity. Suppose we have $n$ observations on the 
variables $R_1, \cdots, R_p$ and we believe that they are driven by a small number of 
principal components, i.e. 

$$R_i(t) = \beta_{i,1}P_1(t) + \cdots + \beta_{i,r}P_r(t) + e_i(t), \quad t = 1, \cdots, n$$

Even if the variances $\mathrm{var}(e_i) = \sigma_i^2$ are not equal, we can still estimate the coefficients $\beta_{i,k}$ and 
$\mathrm{var}(P_k)$ reliably using the sample covariance matrix. We can estimate an average noise variance 
$\sigma^2$ and simply pretend that $\sigma_i^2 = \sigma^2$. 

The proof of part (i) of theorem 3.2 was inspired by [13], in which (3.7) was 
proved. [13] does not discuss random matrices explicitly; see [8] for a discussion 
of the results of [13] in terms of random matrices. 

The next theorem is about the smallest eigenvalues of the sample covariance 
matrix in the spiked population model. 

Theorem 3.3. We use the same notation as in theorem 3.2. 

(i): Assume n > p. Let c = 2 ^ L ,c l = p ~ r ^ s ■ Let C-(n) = C r+s - i+ i, SlV i(n) + 
C r ,e 1 wi( n )- When s > 0, we have 

P(/W(SJ < \J Ae r+3 _ i+1 , c ' ~ ^ -t)< C>(n) e - L1 ^^r (3.9) 

for 1< i < s, t > Vf+p+Sg^ l < 8 < i. When s = 0,r > we have 

<l_V7-^-i)< 2C r , 0l {n)e~ (1 $T (3.10) 
fort> ,^ ei ,0 < S < i When s = 0,r = 0, we have 

- ^<5(l-5)n' _ 3 

P(yJ\ p (S n ) < 1 - Vp/n-t) < e"^, t>0 (3.11) 
(ii): Let $s_0$ be the number of $\theta_i^2$ smaller than $1 - \sqrt{c'}$, $c' = \frac{p-r-s}{n}$. If $s_0 > 0$, 
then for $1 \le i \le s_0$, $t \ge 0$, we have 

P(^J X p - i+1 {S n ) < y/\e r+ ,- i+1 ,c> - t - Ci0i(r + s)/n) < C 2 i ■ e~ c ^ e (3.12) 

where $C_1, C_2$ are two positive constants, and $C_3$ is a positive real number 
depending on $\theta_i$ and $c'$. 

4. Outline of Proof 

This section explains the main idea in the $r = 1$ case of theorem 3.1. 

Consider $A = \theta E_{1,1} + G$, $G \in \mathrm{GOE}(n, \sigma^2/n)$, $\theta > 0$; our objective is to show 

$$\lambda_1(A) \approx \lambda_\theta.$$

Let's consider the upper tail bound first. If we can prove $E[\lambda_1(A)] \le \lambda_\theta$, 
then the concentration of measure for Gaussian processes will yield the desired 
upper tail bound. We begin with $\lambda_1(A) = \max_{x \in S^{n-1}} x^*Ax$, so $\lambda_1(A)$ is the 
maximum of the Gaussian process $\{x^*Ax : x \in S^{n-1}\}$ and one might consider 
using Slepian's lemma (proposition 5.1) to prove $E[\lambda_1(A)] \le \lambda_\theta$ directly. However, this 
seems to be a tall order. 

The first key idea is to stratify $S^{n-1}$ using the first coordinate: 

$$\lambda_1(A) = \max_{u \in [0,1]} L_u, \qquad L_u = \max_{x \in S^{n-1},\, x_1 = u} x^*Ax$$

Each $L_u$ is the maximum of a Gaussian process, and we can use Slepian's lemma 
to prove 

$$E[L_u] \le \theta u^2 + 2\sigma\sqrt{1 - u^2} = \varphi(u)$$

When $\theta > \sigma$, the maximum of $\varphi(u)$ is $\varphi(\sqrt{1 - \sigma^2/\theta^2}) = \theta + \sigma^2/\theta$; when $\theta \le \sigma$, the 
maximum is $\varphi(0) = 2\sigma$. Thus $E[L_u] \le \lambda_\theta$ and we can apply concentration of 
measure for Gaussian processes to get $P(L_u > \lambda_\theta + t) \le e^{-\frac{nt^2}{4\sigma^2}}$. 
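The claim that $\max_u \varphi(u) = \lambda_\theta$ is a one-variable calculus fact: with $s = \sqrt{1 - u^2}$, the function $\theta(1 - s^2) + 2\sigma s$ is maximized at $s = \sigma/\theta$ when $\theta > \sigma$. A small grid check, assuming NumPy (the grid size and the tested values of $\theta$ are arbitrary choices for illustration):

```python
import numpy as np


def phi(u, theta, sigma):
    """Upper bound for E[L_u]: theta*u^2 + 2*sigma*sqrt(1 - u^2)."""
    return theta * u**2 + 2.0 * sigma * np.sqrt(1.0 - u**2)


def lam_theta(theta, sigma):
    """lambda_theta from (3.1)."""
    return theta + sigma**2 / theta if theta > sigma else 2.0 * sigma


sigma = 1.0
u = np.linspace(0.0, 1.0, 200001)
gaps = [abs(phi(u, th, sigma).max() - lam_theta(th, sigma)) for th in (0.5, 1.0, 1.5, 3.0)]
```

Each entry of `gaps` is the difference between the grid maximum of $\varphi$ and $\lambda_\theta$, and should be negligible for all four values of $\theta$, on both sides of the threshold $\theta = \sigma$.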

To get an upper tail bound for $\lambda_1(A) = \max_{u \in [0,1]} L_u$, we have to take a 
union bound. The second key idea is to use an $\epsilon$-net argument to control $\lambda_1(A)$ 
by finitely many $L_u$. More precisely, if $\mathcal{X}$ is an $\epsilon$-net for $S^{n-1}$, then 

$$\lambda_1(A) \le \frac{1}{1 - 2\epsilon} \max_{x \in \mathcal{X}} |x^*Ax|$$

We will use a special $\epsilon$-net 

$$\mathcal{X} = \bigcup_{u \in \mathcal{N}} \mathcal{S}(u), \qquad \mathcal{S}(u) = \{x \in S^{n-1} : x_1 = u\}$$

where $\mathcal{N}$ is a finite subset of $[0, 1]$ whose size depends on $\epsilon$. Then we can use 

$$P(L_u > \lambda_\theta + t) \le e^{-\frac{nt^2}{4\sigma^2}} \quad \text{and} \quad \lambda_1(A) \le \frac{1}{1 - 2\epsilon} \max_{u \in \mathcal{N}} L_u$$

to build an upper tail bound for $\lambda_1(A)$. The final step is optimizing over $\epsilon$ to 
get the best bound ($\epsilon$ should be of order $1/n$). 

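Now, let's consider the lower tail bound for $\lambda_1(A)$. We want to construct an 
$x \in S^{n-1}$ with $x^*Ax \approx \lambda_\theta$. (To be precise, we want a bound for $P(x^*Ax < 
\lambda_\theta - t)$.) Consider 

$$(\theta E_{1,1} + G)x = \lambda_\theta x, \quad x \in S^{n-1}$$

Let $\tilde{G}$ be the lower right $(n-1) \times (n-1)$ submatrix of $G$, $v = (g_{2,1}, \cdots, g_{n,1})^*$, 
$\tilde{x} = (x_2, \cdots, x_n)^*$. Ignoring the small term $g_{1,1}x_1$, the above equation becomes 

$$\theta x_1 + v^*\tilde{x} = \lambda_\theta x_1, \qquad x_1 v + \tilde{G}\tilde{x} = \lambda_\theta \tilde{x}$$

Thus $\tilde{x} = -x_1(\tilde{G} - \lambda_\theta I)^{-1}v$. Of course, such an $x$ might not exist at all, 
since $\lambda_\theta$ might not be an eigenvalue of $A$. However, this heuristic "Schur 
complement" argument suggests a way to construct approximate eigenvectors for $A$. 

When $\theta > \sigma$, we know the correct value for $x_1$ is $\sqrt{1 - \sigma^2/\theta^2}$ (since $\varphi(u)$ attains 
its maximum $\lambda_\theta$ at this point). So the correct way to construct an approximate 
eigenvector is 

$$x_1 = \sqrt{1 - \frac{\sigma^2}{\theta^2}}, \qquad \tilde{x} = -c(\tilde{G} - \lambda_\theta I)^{-1}v, \quad c > 0, \quad |x| = 1$$

With this $x$, we have $x^*Ax \approx \lambda_\theta$. In fact, the formula for $x^*Ax$ involves $L_1 = 
v^*Rv$ and $L_2 = v^*R^2v$, where $R = (\tilde{G} - \lambda_\theta I)^{-1}$; and we can use Wigner's 
semicircle law to show $L_1 \approx \frac{\sigma^2}{n}\mathrm{tr}R \approx -\frac{\sigma^2}{\theta}$, $L_2 \approx \frac{\sigma^2}{n}\mathrm{tr}R^2 \approx \frac{\sigma^2}{\theta^2 - \sigma^2}$. After some 
straightforward calculation, this gives us $x^*Ax \approx \lambda_\theta$. 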
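The construction above can be tried numerically. The sketch below assumes NumPy; the matrix size, seed and the values $\theta = 2$, $\sigma = 1$ are arbitrary illustrative choices. It builds $x$ from $x_1 = \sqrt{1 - \sigma^2/\theta^2}$ and $\tilde{x} \propto -(\tilde{G} - \lambda_\theta I)^{-1}v$, with the constant $c$ fixed by $|x| = 1$, and evaluates the Rayleigh quotient.

```python
import numpy as np

n, theta, sigma = 800, 2.0, 1.0
rng = np.random.default_rng(1)
z = rng.standard_normal((n, n))
G = sigma * (z + z.T) / np.sqrt(2.0 * n)   # GOE(n, sigma^2/n)
A = G.copy()
A[0, 0] += theta                           # A = theta * E_{1,1} + G

lam = theta + sigma**2 / theta             # lambda_theta for theta > sigma
G_sub = G[1:, 1:]                          # lower right (n-1) x (n-1) block
v = G[1:, 0]                               # (g_{2,1}, ..., g_{n,1})
Rv = np.linalg.solve(G_sub - lam * np.eye(n - 1), v)   # R v without forming R

x1 = np.sqrt(1.0 - sigma**2 / theta**2)
# -c * R v, with c > 0 chosen so that the whole vector has unit norm.
x_tail = -Rv * (np.sqrt(1.0 - x1**2) / np.linalg.norm(Rv))
x = np.concatenate(([x1], x_tail))

rayleigh = x @ A @ x
top = np.linalg.eigvalsh(A)[-1]
```

Since $|x| = 1$, `rayleigh` is a lower bound for `top`, and both should be close to $\lambda_\theta = 2.5$; this is the mechanism behind the lower tail bound.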
5. Deformed GOE: Proof of Theorem 3.1 

By the orthogonal invariance property of the GOE, we can assume $P = \theta_1 E_{1,1} + 
\cdots + \theta_r E_{r,r}$. The proof is divided into two subsections, corresponding to the two 
parts of theorem 3.1. 

5.1. Upper Tail Bound for the Largest Eigenvalues 

We prove part (i) of theorem 3.1 in this section. 

The $r > 0$ case. By the minimax characterization of eigenvalues, for $1 \le i \le r$, 
we have 

$$\lambda_i(A) = \min_{x^1, \cdots, x^{i-1} \in \mathbb{R}^n}\ \max_{x \in S^{n-1} \cap \{x^1, \cdots, x^{i-1}\}^\perp} x^*Ax \le \max_{x \in S^{n-1},\, x_1 = \cdots = x_{i-1} = 0} x^*Ax \le \|A_i\| \qquad (5.1)$$

$A_i$ is the lower right $(n - i + 1) \times (n - i + 1)$ part of $A$. We can consider $A_i$ as 
a linear operator from $V_i$ to itself, where $V_i = \{x \in \mathbb{R}^n : x_1 = \cdots = x_{i-1} = 0\}$, 
so that its operator norm can be controlled by the maximum of $|x^*Ax|$ over an 
$\epsilon$-net. More precisely, using lemma 8.1, we have 

$$\|A_i\| \le \frac{1}{1 - 2\epsilon} \max\{x^*Ax, -x^*Ax : x \in \mathcal{X}_i\}, \quad 0 < \epsilon < \frac{1}{2} \qquad (5.2)$$

When $i = r$, $\mathcal{X}_r$ is an $\epsilon$-net for $(S^{n-1} \cap V_r \cap \{x \in \mathbb{R}^n : x_r \ge 0\}, |\cdot|)$. When 
$1 \le i \le r - 1$, $\mathcal{X}_i$ is an $\epsilon$-net for $(S^{n-1} \cap V_i, |\cdot|)$. (In the $r = 1$ case, we do not 
have the $1 \le i \le r - 1$ part.) 
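The net bound of the form (5.2) can be illustrated in the simplest setting, the full circle $S^1$, where an $\epsilon$-net is easy to write down. This is only a sketch assuming NumPy: the helper `circle_net` and the $2 \times 2$ test matrix are illustrative, and the structured nets of lemma 8.4 and lemma 8.5 are not reproduced here.

```python
import numpy as np


def circle_net(eps):
    """Points on S^1 such that every unit vector is within eps of one of them."""
    # With angular spacing delta, the worst point sits at angle delta/2 from the
    # nearest net point, i.e. at chord distance 2*sin(delta/4) <= eps.
    k = int(np.ceil(2.0 * np.pi / (4.0 * np.arcsin(eps / 2.0))))
    angles = 2.0 * np.pi * np.arange(k) / k
    return np.stack([np.cos(angles), np.sin(angles)], axis=1)


eps = 0.1
net = circle_net(eps)
A = np.array([[1.0, 0.7], [0.7, -0.3]])           # any real symmetric matrix
quad = np.einsum("ij,jk,ik->i", net, A, net)      # x^T A x for each net point
net_max = np.abs(quad).max()
op_norm = np.abs(np.linalg.eigvalsh(A)).max()     # operator norm of symmetric A
upper = net_max / (1.0 - 2.0 * eps)
```

The operator norm is sandwiched as `net_max <= op_norm <= upper`, so a finite maximum controls the supremum over the sphere at the cost of the factor $1/(1 - 2\epsilon)$.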

Our proof uses e-nets with a special structure, whose construction and bound 
for its size are rather delicate, so we defer the details to the appendix. See lemma 
8.4 and lemma 8.5. 

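Let's consider $\|A_r\|$ first. In this case 

$$\mathcal{X}_r = \bigcup_{u \in \mathcal{N}_r} \mathcal{S}_r(u) \qquad (5.3)$$

where $\mathcal{S}_r(u) = \{x \in S^{n-1} : x_1 = \cdots = x_{r-1} = 0, x_r = u\}$, $u \in [0, 1]$, $\mathcal{N}_r$ is a 
finite subset of $[0, 1]$ with $|\mathcal{N}_r| \le \frac{1}{\epsilon}$. When $1 \le i \le r - 1$ 

$$\mathcal{X}_i = \bigcup_{u \in \mathcal{N}_i} \mathcal{S}_i(u) \qquad (5.4)$$

where $\mathcal{S}_i(u) = \{x \in S^{n-1} : x_1 = \cdots = x_{i-1} = 0, (x_i, \cdots, x_r) = u\}$, $u \in B_2^{r-i+1}$, 
$\mathcal{N}_i$ is a finite subset of $B_2^{r-i+1}$ with $|\mathcal{N}_i| \le \frac{4(r-i+1)^2}{\epsilon}\left(1 + \frac{2(r-i+1)}{\epsilon}\right)^{r-i}$. 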

With this structure for $\mathcal{X}_i$, maximizing $|x^*Ax|$ over $x \in \mathcal{X}_i$ becomes 
stratified: we can maximize $|x^*Ax|$ for $x \in \mathcal{S}_i(u)$ to get $L_{i,u} = \max_{x \in \mathcal{S}_i(u)} x^*Ax$, 
$\bar{L}_{i,u} = \max_{x \in \mathcal{S}_i(u)} -x^*Ax$ for each $u \in \mathcal{N}_i$, then select the largest among 
$L_{i,u}, \bar{L}_{i,u}$, $u \in \mathcal{N}_i$, i.e. 

$$\max\{x^*Ax, -x^*Ax : x \in \mathcal{X}_i\} = \max\{L_{i,u}, \bar{L}_{i,u} : u \in \mathcal{N}_i\} \qquad (5.5)$$

(5.1), (5.2) and (5.5) imply 

$$\lambda_i(A) \le \frac{1}{1 - 2\epsilon} \max\{L_{i,u}, \bar{L}_{i,u} : u \in \mathcal{N}_i\}, \quad 0 < \epsilon < \frac{1}{2} \qquad (5.6)$$

(5.6) is the starting point for building an upper tail bound for $\lambda_i(A)$. The 
next step is to establish a tail bound for each $L_{i,u}, \bar{L}_{i,u}$, then take a union bound 
over $u \in \mathcal{N}_i$. We keep $\epsilon$ as a free parameter along the way and optimize over $\epsilon$ 
at the end. To get an upper tail bound for $L_{i,u}$ (similarly for $\bar{L}_{i,u}$), we will prove 
$E[L_{i,u}] \le \lambda_{\theta_i}$ using Slepian's lemma as stated below, then use the concentration of 
measure inequality for Gaussian processes. 

Proposition 5.1. (Slepian's Lemma) Let $(X_t)_{t \in T}$ and $(Y_t)_{t \in T}$ be two centered 
Gaussian processes defined on the same finite index set $T$. Assume $E|X_s - 
X_t|^2 \le E|Y_s - Y_t|^2$ for all $s, t \in T$. Then $E[\max_{t \in T} X_t] \le E[\max_{t \in T} Y_t]$. 

Remark 5.2. Although Slepian's lemma is stated for Gaussian processes defined 
on a finite index set, we will use it on Gaussian processes defined on infinite 
index sets. This is justified by an approximation procedure and we omit this 






routine matter. This remark applies also to the application of proposition 5.3 
and proposition 6.1. 

A proof of proposition 5.1 and its generalization (proposition 6.1 below) can 
be found in [13]. 

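Let $X_x = x^*Gx$, $x \in S^{n-1}$, and write $u = (u_i, \cdots, u_r) \in B_2^{r-i+1}$, then 

$$L_{i,u} = \sum_{j=i}^{r} \theta_j u_j^2 + \max_{x \in \mathcal{S}_i(u)} X_x \qquad (5.7)$$

$$\bar{L}_{i,u} = -\sum_{j=i}^{r} \theta_j u_j^2 + \max_{x \in \mathcal{S}_i(u)} -X_x \qquad (5.8)$$

Let $Y_x = \frac{2\sigma}{\sqrt{n}} \sum_{j=r+1}^{n} x_j \omega_j$, $x \in S^{n-1}$, where $\omega_{r+1}, \cdots, \omega_n$ are i.i.d. $N(0, 1)$ 
random variables. Then for $x, y \in \mathcal{S}_i(u)$, we have 

$$\begin{aligned}
E|X_x - X_y|^2 &= E\Big|\sum_{k,j=1}^{n} (x_k x_j - y_k y_j) g_{k,j}\Big|^2 \qquad (5.9) \\
&= \sum_{k=1}^{n} (x_k^2 - y_k^2)^2\, \frac{2\sigma^2}{n} + \sum_{1 \le k < j \le n} 4(x_k x_j - y_k y_j)^2\, \frac{\sigma^2}{n} \\
&= \frac{2\sigma^2}{n} \Big( \big(\sum_{k=1}^{n} x_k^2\big)^2 + \big(\sum_{k=1}^{n} y_k^2\big)^2 - 2\big(\sum_{k=1}^{n} x_k y_k\big)^2 \Big) \\
&= \frac{2\sigma^2}{n} \big(2 - 2\langle x, y \rangle^2\big) \\
&= \frac{4\sigma^2}{n} |x - y|^2 - \frac{4\sigma^2}{n} \big(1 - \langle x, y \rangle\big)^2 \\
&\le \frac{4\sigma^2}{n} |x - y|^2 = E|Y_x - Y_y|^2
\end{aligned}$$

The last equality holds because $x$ and $y$ agree in their first $r$ coordinates. 
Using Slepian's lemma, and noticing that the maximum of $\{Y_x : x \in \mathcal{S}_i(u)\}$ is 
reached when $(x_{r+1}, \cdots, x_n)$ is a multiple of $(\omega_{r+1}, \cdots, \omega_n)$, we have 

$$\begin{aligned}
E[\max_{x \in \mathcal{S}_i(u)} X_x],\ E[\max_{x \in \mathcal{S}_i(u)} -X_x] &\le E[\max_{x \in \mathcal{S}_i(u)} Y_x] = E\Big[\frac{2\sigma}{\sqrt{n}} \sqrt{1 - |u|^2} \sqrt{\sum_{j=r+1}^{n} \omega_j^2}\,\Big] \\
&\le \frac{2\sigma}{\sqrt{n}} \sqrt{1 - |u|^2} \sqrt{E\Big[\sum_{j=r+1}^{n} \omega_j^2\Big]} \quad \text{(by Cauchy-Schwarz)} \\
&= 2\sigma \sqrt{1 - \frac{r}{n}} \sqrt{1 - |u|^2} \le 2\sigma \sqrt{1 - |u|^2}
\end{aligned}$$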
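The increment computation in (5.9) is a finite identity for unit vectors and can be checked without any sampling. The sketch below assumes NumPy; the helper name `incr_var_X` is illustrative. It computes $E|X_x - X_y|^2$ entry by entry from the GOE variances and compares it with the closed form $\frac{4\sigma^2}{n}|x - y|^2 - \frac{4\sigma^2}{n}(1 - \langle x, y \rangle)^2$.

```python
import numpy as np


def incr_var_X(x, y, sigma):
    """E|x^T G x - y^T G y|^2 for G in GOE(n, sigma^2/n), summed over independent entries."""
    n = len(x)
    d = np.outer(x, x) - np.outer(y, y)            # coefficient of g_{k,j}
    diag = np.sum(np.diag(d) ** 2) * 2.0 * sigma**2 / n
    iu = np.triu_indices(n, k=1)
    # Each off-diagonal g_{k,j}, k < j, appears twice in the quadratic form.
    off = np.sum((2.0 * d[iu]) ** 2) * sigma**2 / n
    return diag + off


rng = np.random.default_rng(3)
n, sigma = 6, 1.3
x = rng.standard_normal(n)
x /= np.linalg.norm(x)
y = rng.standard_normal(n)
y /= np.linalg.norm(y)

lhs = incr_var_X(x, y, sigma)
dist2 = np.sum((x - y) ** 2)
closed_form = 4 * sigma**2 / n * dist2 - 4 * sigma**2 / n * (1.0 - x @ y) ** 2
slepian_bound = 4 * sigma**2 / n * dist2
```

The subtracted term is a square, so `lhs` never exceeds `slepian_bound`; this is exactly the comparison that Slepian's lemma needs.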
This together with (5.7), (5.8) implies 

$$E[L_{i,u}] \le \sum_{j=i}^{r} \theta_j u_j^2 + 2\sigma\sqrt{1 - |u|^2} = \varphi(u)$$

$$E[\bar{L}_{i,u}] \le -\sum_{j=i}^{r} \theta_j u_j^2 + 2\sigma\sqrt{1 - |u|^2} \le \varphi(u)$$

When $\theta_i > \sigma$, the maximum of $\varphi(u)$ is $\theta_i + \frac{\sigma^2}{\theta_i}$ and is attained when $u = 
(\sqrt{1 - \sigma^2/\theta_i^2}, 0, \cdots, 0)$; when $\theta_i \le \sigma$, the maximum of $\varphi(u)$ is $2\sigma$. Therefore, the 
previous two inequalities imply 

$$E[L_{i,u}],\ E[\bar{L}_{i,u}] \le \lambda_{\theta_i} \qquad (5.10)$$

Now we want to apply concentration inequalities to $L_{i,u}, \bar{L}_{i,u}$, which are suprema 
of Gaussian processes. We need the following proposition, which is proved via 
the Gaussian concentration inequality (proposition 8.6 in the appendix). 

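Proposition 5.3. Let $(X_{i,j})_{1 \le i \le n, 1 \le j \le m}$ be a centered Gaussian process. Then 
for $t > 0$ 

$$P(\min_i \max_j X_{i,j} > E[\min_i \max_j X_{i,j}] + t) \le e^{-\frac{t^2}{2\max_{i,j} EX_{i,j}^2}}$$

$$P(\min_i \max_j X_{i,j} < E[\min_i \max_j X_{i,j}] - t) \le e^{-\frac{t^2}{2\max_{i,j} EX_{i,j}^2}}$$

Proof. We can find i.i.d. $N(0, 1)$ random variables $Y_1, \cdots, Y_{nm}$ and $A \in \mathbb{R}^{nm \times nm}$ 
so that $X_{i,j} = \sum_{k=1}^{nm} a_{(i-1)m+j,k} Y_k$, $1 \le i \le n$, $1 \le j \le m$. Define $g_{i,j}(y) = 
\sum_{k=1}^{nm} a_{(i-1)m+j,k}\, y_k$, $g(y) = \min_{1 \le i \le n} \max_{1 \le j \le m} g_{i,j}(y)$, $y \in \mathbb{R}^{nm}$. Then $\min_i \max_j X_{i,j} = 
g(Y)$. 

Let $y, \tilde{y} \in \mathbb{R}^{nm}$, $g(y) = g_{i_1,j_1}(y)$, $g(\tilde{y}) = g_{i_2,j_2}(\tilde{y})$. Assume $g(y) \ge g(\tilde{y})$, 
$\max_{1 \le j \le m} g_{i_2,j}(y) = g_{i_2,j_3}(y)$, then 

$$\begin{aligned}
|g(y) - g(\tilde{y})| &= g_{i_1,j_1}(y) - g_{i_2,j_2}(\tilde{y}) = \min_{1 \le i \le n} \max_{1 \le j \le m} g_{i,j}(y) - g_{i_2,j_2}(\tilde{y}) \\
&\le \max_{1 \le j \le m} g_{i_2,j}(y) - g_{i_2,j_2}(\tilde{y}) = g_{i_2,j_3}(y) - \max_{1 \le j \le m} g_{i_2,j}(\tilde{y}) \\
&\le g_{i_2,j_3}(y) - g_{i_2,j_3}(\tilde{y}) \le \max_{i,j} |g_{i,j}(y) - g_{i,j}(\tilde{y})|
\end{aligned}$$

Hence $g$ has Lipschitz constant bounded by the norm of the operator $A : 
(\mathbb{R}^{nm}, \ell_2) \to (\mathbb{R}^{nm}, \ell_\infty)$, which equals 

$$\max_{i,j} \sqrt{\sum_{k=1}^{nm} a_{(i-1)m+j,k}^2} = \sqrt{\max_{i,j} EX_{i,j}^2}$$

Then we can apply proposition 8.6 to conclude. □ 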
Remark 5.4. Using a similar argument, proposition 5.3 generalizes naturally 
to: max min, max min max, min max min, etc. We will only use the "max" 
version (i.e. the $n = 1$ case) in the argument that follows. However, we will be using 
the full "min max" version in the proof of theorem 3.2. 

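Since $E[X_x^2] = \sum_{k=1}^{n} x_k^4 E[g_{k,k}^2] + \sum_{k<j} 4x_k^2 x_j^2 E[g_{k,j}^2] = \frac{2\sigma^2}{n}$, we 
can use (5.10) and apply proposition 5.3 to get 

$$P(L_{i,u} > \lambda_{\theta_i} + t) \le e^{-\frac{nt^2}{4\sigma^2}}, \quad t \ge 0$$

$$P(\bar{L}_{i,u} > \lambda_{\theta_i} + t) \le e^{-\frac{nt^2}{4\sigma^2}}, \quad t \ge 0$$

Let $\epsilon = \frac{(1-\alpha)t}{2(\lambda_{\theta_i} + t)}$, $0 < \alpha < 1$, then $(1 - 2\epsilon)(\lambda_{\theta_i} + t) = \lambda_{\theta_i} + \alpha t$. Using (5.6) and 
the above two inequalities, we have 

$$\begin{aligned}
P(\lambda_i(A) > \lambda_{\theta_i} + t) &\le P(\max\{L_{i,u}, \bar{L}_{i,u} : u \in \mathcal{N}_i\} > (1 - 2\epsilon)(\lambda_{\theta_i} + t)) \\
&\le \sum_{u \in \mathcal{N}_i} P(L_{i,u} > \lambda_{\theta_i} + \alpha t) + \sum_{u \in \mathcal{N}_i} P(\bar{L}_{i,u} > \lambda_{\theta_i} + \alpha t) \\
&\le 2|\mathcal{N}_i|\, e^{-\frac{n\alpha^2 t^2}{4\sigma^2}}
\end{aligned}$$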

When t > V 2 (^+^ choose o = i(i + Jl - 8(r ^+ 1)j2 -), then a>l-S>h. 

This choice of a guarantees < e < | (This is needed when we apply (5.6) 
and lemma 8.4). When i = r, we use < =; when 1 < i < r — 1, we use 
< 4(r-t+i) ^ _|_ 2(r-i+i) y-\ After some simplification, we get 

P(Xi(A) > X 8z +t) < 2C r _ i+Mj (n) ■ < 2G r _ i+1A (n) • e^'^ 

This finishes the proof of the $r > 0$ case. 

When $r = 0$, $\lambda_1(A) = \max_{x \in S^{n-1}} X_x$ with $X_x = x^*Gx$. Define $Y_x = \frac{2\sigma}{\sqrt{n}} \sum_{j=1}^{n} x_j \omega_j$, $x \in 
S^{n-1}$, where $\omega_1, \cdots, \omega_n$ are i.i.d. $N(0, 1)$ random variables. Similar to (5.9), we have 
$E|X_x - X_y|^2 \le E|Y_x - Y_y|^2$, $\forall x, y \in S^{n-1}$; thus $E[\lambda_1(A)] \le E[\max_{x \in S^{n-1}} Y_x] \le 
2\sigma$ by Slepian's lemma. Applying proposition 5.3, we get 

$$P(\lambda_1(A) > 2\sigma + t) \le e^{-\frac{nt^2}{4\sigma^2}}, \quad t \ge 0$$
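The $r = 0$ bound $P(\lambda_1(A) > 2\sigma + t) \le e^{-nt^2/(4\sigma^2)}$ can be compared with a small Monte Carlo experiment. This is a rough sketch assuming NumPy; the function name `goe_top_eigs`, the dimension, the number of trials and the value of $t$ are arbitrary illustrative choices, and an empirical frequency of course proves nothing about the tail.

```python
import numpy as np


def goe_top_eigs(n, sigma, trials, seed=0):
    """Largest eigenvalue of independent GOE(n, sigma^2/n) samples."""
    rng = np.random.default_rng(seed)
    out = np.empty(trials)
    for k in range(trials):
        z = rng.standard_normal((n, n))
        g = sigma * (z + z.T) / np.sqrt(2.0 * n)
        out[k] = np.linalg.eigvalsh(g)[-1]
    return out


n, sigma, t = 100, 1.0, 0.2
tops = goe_top_eigs(n, sigma, trials=200)
empirical = np.mean(tops > 2.0 * sigma + t)        # observed exceedance frequency
bound = np.exp(-n * t**2 / (4.0 * sigma**2))       # right-hand side of the tail bound
```

The sample mean of `tops` sits slightly below $2\sigma$, in line with $E[\lambda_1(A)] \le 2\sigma$, and the exceedance frequency stays under `bound`.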

5.2. Approximate Eigenvectors 

We prove part (ii) of theorem 3.1 in this section. 

The idea of the proof is to construct, for each $i$, $1 \le i \le r_0$, an approximate 
eigenvector $x$, i.e. $x \in S^{n-1}$ with $x^*Ax \approx \lambda_{\theta_i}$. 

Let $m = n - r$ and let $\sqrt{m/n}\,\tilde{G}$ be the lower right $m \times m$ submatrix of $G$; then 
$\tilde{G} \in \mathrm{GOE}(m, \sigma^2/m)$. Let $2\sigma < \lambda_0 < \lambda_{\theta_i}$, $B = \{\lambda_{\max}(\tilde{G}) \le \lambda_0\}$, then (3.3) in (i) 
implies 

$$P(B^c) \le e^{-\frac{m(\lambda_0 - 2\sigma)^2}{4\sigma^2}} \qquad (5.11)$$

We will only construct approximate eigenvectors on the event $B$; the indicator 
$1_B$ might not be mentioned at every instance. 

Let x G S 71 ^ 1 be such that x\ = ■ ■ ■ = Xi—\ = Xi+\ = • • • = x r = 0, 



and 



t a Rv 

(xr+i,--- ,x n ) =-~ a — 7= (5.12) 



(fli.r+l, • ' ' ,9i,n) = \ — <?V 

V n 

The random vector v defined above has i.i.d. Af(0, — ) coordinates. R = (G 
A^/) -1 , L\ = v* Rv, L 2 = v*R 2 v. A straight forward calculation shows 



\ 9i - x*Ax - (1 - ^jl^L) 2 -^- - 9i4 {l - ^) (5.13) 



2 



2 



The next step is to show L\ w — j-,L 2 ~ grr^2- ^his ma kes the grouping of 
terms in (5.13) clear: the four terms are all small and we can take a union bound 
to get a deviation inequality for Xg i — x* Ax. 

There are two sources of randomness in Ly. v and G; and they are inde- 
pendent. We break the task of building deviation inequalities for Lj into three 
steps. The first step is to show that, conditioning on G, Lj concentrates around 
E[Lj\G), see lemma 5.5. The second step is to show that E[Lj\G] concentrates 
around E[Lj], see lemma 5.6. The third step is to show « — j-,E[L 2 ] ~ 

e i^ a 2 , see lemma 5.7. 

Lemma 5.5. t > 0. On the event B, we have 

P l Ll - IfrR <-t\G)< e -im(^i+2(x et -x )t-if 
m 

p(L x - —trR >t\G)< e-W^-Ao)^ 2 
m 

P(L 2 - -trR 2 <-t \G)< g-W^-Ao) 4 * 2 
m 

P( L . 2 _ IfrR 2 >t\G)< e-Xv^+^-M 2 *-!) 2 



Conditioning on G, R is a constant matrix. Since the distribution of v is 
orthogonal invariant, we can diagonalize the quadratic forms L\ = v*Rv,L 2 = 
v*R 2 v and apply proposition 8.7 to get lemma 5.5. The bounds in the first 
and fourth inequalities are complicated. When we apply lemma 5.5, we will 
use e _ 3 m ( 1- ' 5 ) O^-Ao) 2 * 2 as a bound in the first inequality; this is valid when 
(Xg i — Xo)t < njfp ■ Similarly for the fourth inequality. 

Lemma 5.6. t > 0. On t/ie event B, we have 

P(—trR - E[—(trR)l B ] > t (or < -t) ) < e -|w(A 6i -Ao) 2 i 2 

TO TO 

P(—trR 2 - E[—(trR 2 )l B ] > t (or < -t) ) < e -|™(A ei -A ) 4 t 2 

TO TO 

Proof, f = ^(trR)ls is a function of the random variables gij; and these gij 
are independent. If we can divide {gij} into several groups such that each group 
has limited influence on /, then McDiarmid's Inequality (proposition 8.8) will 
yield a concentration inequality for /. 

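Let $\tilde{G}_j$ be the submatrix of $\tilde{G}$ obtained by deleting the $j$-th row and $j$-th 
column. Then proposition 8.9 implies 

$$\lambda_1(\tilde{G}) \ge \lambda_1(\tilde{G}_j) \ge \lambda_2(\tilde{G}) \ge \lambda_2(\tilde{G}_j) \ge \cdots \ge \lambda_m(\tilde{G}) \qquad (5.14)$$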
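The interlacing chain (5.14) holds for any real symmetric matrix and its principal submatrices, which is easy to confirm numerically. A minimal sketch assuming NumPy; the size, the seed and the deleted index are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
m = 8
z = rng.standard_normal((m, m))
G = (z + z.T) / 2.0                       # any real symmetric matrix

j = 3                                     # index of the deleted row and column
keep = [k for k in range(m) if k != j]
G_j = G[np.ix_(keep, keep)]               # principal submatrix of size m - 1

lam = np.linalg.eigvalsh(G)[::-1]         # lambda_1 >= ... >= lambda_m
mu = np.linalg.eigvalsh(G_j)[::-1]        # eigenvalues of the submatrix, descending

upper_ok = np.all(lam[:-1] >= mu - 1e-12)   # lambda_k(G) >= lambda_k(G_j)
lower_ok = np.all(mu >= lam[1:] - 1e-12)    # lambda_k(G_j) >= lambda_{k+1}(G)
```

Both flags are true for every choice of `j`, which is why deleting one row and column can move a monotone spectral statistic such as $\mathrm{tr}R$ only by a bounded amount.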

Let ip(x) = Xf) — , x € (—00, Aq]. We have 



- -{trR)l B + -(tr(G 3 - AeJ" 1 )^ 
m m 

1 ni m — 1 

1 m — 1 

= -IE (^( A '(G)) - ^(G,))) + p(A ro (G))]l B 

TO ^— ' 

The monotonicity of tp(x) and (5.14) imply 

m— 1 

|-[E( V (A ; (G))- V (A ; (G,-))) + ^A m (G))]l B | 



z=i 



< -^(Ai(G))l B < 



TO m(Xg l — Aq) 

So / is within m( - A() 1 _ Ao - ) distance to ^(tr(Gj — A6/ ; ) _1 )1b. Therefore, changing 

the j-th row and j-th column of G will leave / to vary in an interval of length 

2 

m(Xg i — A ) ' 

Divide {<7zj} into to groups: X s = {gzj| max(Z, j) = s}, s = r + 1, • • ■ , n. 
Then each X s influences / by at most m ^^\ ) ■ Now we can apply proposition 
8.8 to get the first inequality. The proof for the second inequality is similar. □ 

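Let $g(z) = \int_{-2\sigma}^{2\sigma} \frac{\sqrt{4\sigma^2 - x^2}}{2\pi\sigma^2} \cdot \frac{1}{x - z}\,dx$ be the Stieltjes transform of the semicircle 
law. Then for $\theta > \sigma$ 

$$g(\lambda_\theta) = -\frac{1}{\theta}, \qquad g'(\lambda_\theta) = \frac{1}{\theta^2 - \sigma^2} \qquad (5.15)$$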
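The two values in (5.15) follow from the closed form $g(z) = \frac{-z + \sqrt{z^2 - 4\sigma^2}}{2\sigma^2}$ for $z > 2\sigma$, and they can also be checked by direct quadrature. The sketch below assumes NumPy; the function name `stieltjes_semicircle`, the midpoint rule, the cell count and the finite-difference step are illustrative numerical choices.

```python
import numpy as np


def stieltjes_semicircle(z, sigma, cells=200000):
    """Midpoint-rule value of the integral of rho(x)/(x - z) over [-2*sigma, 2*sigma], z > 2*sigma."""
    h = 4.0 * sigma / cells
    x = -2.0 * sigma + h * (np.arange(cells) + 0.5)
    rho = np.sqrt(4.0 * sigma**2 - x**2) / (2.0 * np.pi * sigma**2)  # semicircle density
    return np.sum(rho / (x - z)) * h


theta, sigma = 2.0, 1.0
z = theta + sigma**2 / theta                     # lambda_theta, outside the support
g_val = stieltjes_semicircle(z, sigma)
step = 1e-3
g_prime = (stieltjes_semicircle(z + step, sigma) - stieltjes_semicircle(z - step, sigma)) / (2 * step)
```

For $\theta = 2$, $\sigma = 1$ the expected values are $g(\lambda_\theta) = -1/2$ and $g'(\lambda_\theta) = 1/3$.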
Lemma 5.7. There exists a constant $C$ such that 
\E[-(trR)l B }-g(X e ,)\ < 



m ' m(Xg i — Aq) 

\E[-(trR 2 )l B ]-g'(Xe t )\ < 



m ' m(Xe i — Xq) 2 

Proof. This lemma is proved using lemma 8.10, we will adopt the notation of 
lemma 8.10 in the following. Let ip(x) = A ^ 1 _ x l x <\ a . the hrst inequality can be 
reformulated as 

| J ip{x)dEF m {x) - J if{x)dF(x)\ < ^|M|max (5.16) 

Lemma 8.10 says 

(x)dEF m (x) - I ' 4>{x)dF{x)\ < -Ma** 

J m 

if (j>(x) = c ■ l(_oo, a ] ( x )- To prove (5.16), we approximate <p(x) by ip A {x) = 
c iT(_oo,ai] ( x )-> where A is the division a% < • • • < <Xfc = A , and the coefficients 
are Ck = ip(a,k),Ci = <p(a 2 ) — <p(a,:+i),i < k — 1. Since <p(x) is nonnegative and 
monotone, we have X) l c «l — 2 1| V 3 1 1 max, thus 



f C 2C 

VA (x)dEF m (x) - / <p A (x)dF(x)\ < V -\ci\ < —I 
J z — ' m m 



<P 



Let ||A|| = max(ai — a,_i) — > 0, we get (5.16). The second inequality is proved 
in a similar fashion. □ 

Combine the previous three lemmas, we can establish deviation inequalities 
for L\ and L 2 as follows. When < {Xg i — X )t < ^ 5 ^^~ S ^ 

P{Lx - g(X 6i ) > t + - ° - . ) < 2e"^ i(Ae ^ Ao)2( TT^ )2t2 (5.17) 
m(X 9i - A ) 

(1 i~~r\ \ \Vt 1—5 \2 + 2 

P(Lx - g{X 6i ) <-t- ° J < 2e-^^- A °) Ci+7J=?) 4 
m(A e! - A ) 

WhenO<(A e! -A ) 2 t<^(i±V|^) 

P(L 2 - g'(A e! ) > t + - C - .„ ) < 2e-^" (A,, -- Ao)4( ^fc )2t2 (5.18) 

m(X 0i - x y 

P(L 2 - 5 '(A e J <-t- , - - , 2 ) < 2e -^" (A ' , -- Ao) 1+73^5^ 4 

m(A ei - x y 

The first two terms in (5.13) are bounded above by 2r£ + |<7»,j|. Using (5.15), 
(5.17) and (5.18), we can build deviation bounds for the last two terms in (5.13) 
(the routine details are omitted). We choose $\lambda_0 = \frac{1}{2}(2\sigma + \lambda_{\theta_i})$. The end result 
is 

P({A 9l - x*Ax > t + dar/n} n B) < 8e - 4 <^+-) a (5.19) 

Let Bj = {Ae ; — x* Ax > t + Cicr/n} n -B; then the 1st, ■ ■ ■ ,i — th approximate 
eigenvectors we built are valid on B n (U* =1 .Bj) c , so 

P(Xi(A) <\g.-t- Ciar/n) (5.20) 

< p{ {B^{if ]=1 B 3 ry ) < p{b c )+y, p ^) 

3=1 



6. Spiked Population Model: Proof of Theorem 3.2 and Theorem 3.3 

Since the distribution of $G$ is orthogonally invariant, we can assume 

$$\Sigma = \mathrm{diag}\{\theta_1^2, \cdots, \theta_{r+s}^2, 1, \cdots, 1\}$$

The proof is divided into three subsections, corresponding to part (i) of theorem 
3.2, part (i) of theorem 3.3, and part (ii) of both theorems. As mentioned before, 
the proof of part (i) of both theorems follows the same idea used in proving part 
(i) of theorem 3.1; the proof of part (ii) of both theorems is similar to the proof 
of part (ii) of theorem 3.1. 

6.1. Upper Tail Bound for the Largest Eigenvalues 

We prove part (i) of theorem 3.2 in this section. 

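The $r > 0$ case. By the minimax characterization of eigenvalues, for $1 \le i \le r$, 
we have 

$$\lambda_i(S_n) = \min_{x^1, \cdots, x^{i-1} \in \mathbb{R}^p}\ \max_{x \in S^{p-1} \cap \{x^1, \cdots, x^{i-1}\}^\perp} x^*S_n x \le \max_{x \in S^{p-1},\, x_1 = \cdots = x_{i-1} = 0} x^*S_n x = \sigma_i^2 \qquad (6.1)$$

$\sigma_i$ is the largest singular value of the lower $(p - i + 1) \times n$ submatrix of $\frac{1}{\sqrt{n}}\Sigma^{1/2}G$. 
Let $X_{x,y} = x^*(\frac{1}{\sqrt{n}}\Sigma^{1/2}G)y$, $x \in S^{p-1}$, $y \in S^{n-1}$, then 

$$\sigma_i = \max\{X_{x,y} : x \in S^{p-1} \cap V_i,\ y \in S^{n-1}\}$$

where $V_i = \{x \in \mathbb{R}^p : x_1 = \cdots = x_{i-1} = 0\}$. Using lemma 8.2, we have 

$$\sigma_i \le \frac{1}{1 - \epsilon} \max\{X_{x,y} : x \in \mathcal{X}_i,\ y \in S^{n-1}\}, \quad 0 < \epsilon < 1 \qquad (6.2)$$

When $i = r$, $\mathcal{X}_r$ is an $\epsilon$-net for $(S^{p-1} \cap V_r \cap \{x \in \mathbb{R}^p : x_r \ge 0\}, |\cdot|)$. When 
$1 \le i \le r - 1$, $\mathcal{X}_i$ is an $\epsilon$-net for $(S^{p-1} \cap V_i, |\cdot|)$. 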
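By lemma 8.4 and lemma 8.5, we can pick 

$$\mathcal{X}_r = \bigcup_{u \in \mathcal{N}_r} \mathcal{S}_r(u) \qquad (6.3)$$

where $\mathcal{S}_r(u) = \{x \in S^{p-1} : x_1 = \cdots = x_{r-1} = 0, x_r = u\}$, $u \in [0, 1]$, $\mathcal{N}_r$ is a 
finite subset of $[0, 1]$ with $|\mathcal{N}_r| \le \frac{1}{\epsilon}$. When $1 \le i \le r - 1$ 

$$\mathcal{X}_i = \bigcup_{u \in \mathcal{N}_i} \mathcal{S}_i(u) \qquad (6.4)$$

where $\mathcal{S}_i(u) = \{x \in S^{p-1} : x_1 = \cdots = x_{i-1} = 0, (x_i, \cdots, x_r) = u\}$, $u \in B_2^{r-i+1}$, 
$\mathcal{N}_i$ is a finite subset of $B_2^{r-i+1}$ with $|\mathcal{N}_i| \le \frac{4(r-i+1)^2}{\epsilon}\left(1 + \frac{2(r-i+1)}{\epsilon}\right)^{r-i}$. 
Define $L_{i,u} = \max_{x \in \mathcal{S}_i(u),\, y \in S^{n-1}} X_{x,y}$, then 

$$\max\{X_{x,y} : x \in \mathcal{X}_i,\ y \in S^{n-1}\} = \max\{L_{i,u} : u \in \mathcal{N}_i\} \qquad (6.5)$$

(6.1), (6.2) and (6.5) imply 

$$\sqrt{\lambda_i(S_n)} \le \frac{1}{1 - \epsilon} \max\{L_{i,u} : u \in \mathcal{N}_i\}, \quad 0 < \epsilon < 1 \qquad (6.6)$$

We will be using (6.6) to establish an upper tail bound for $\lambda_i(S_n)$. As in the 
GOE case, we will prove a tail bound for each $L_{i,u}$, then take a union bound 
over $u \in \mathcal{N}_i$. 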

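Let $u = (u_i, \cdots, u_r) \in B_2^{r-i+1}$, $x \in \mathcal{S}_i(u)$, $y \in S^{n-1}$, and consider 

$$Y_{x,y} = \frac{1}{\sqrt{n}} \sqrt{\sum_{k=i}^{r} \theta_k^2 u_k^2 + 1 - |u|^2}\ \sum_{j=1}^{n} y_j \omega_j + \frac{1}{\sqrt{n}} \sum_{j=r+1}^{p} x'_j \beta_j \qquad (6.7)$$

in which $\omega_1, \cdots, \omega_n, \beta_{r+1}, \cdots, \beta_p$ are i.i.d. $N(0, 1)$ random variables, and 

$$x' = (x'_{r+1}, \cdots, x'_p) = (\theta_{r+1}x_{r+1}, \cdots, \theta_{r+s}x_{r+s}, x_{r+s+1}, \cdots, x_p)$$

Then for $x, \tilde{x} \in \mathcal{S}_i(u)$, $y, \tilde{y} \in S^{n-1}$, we have 

$$\begin{aligned}
E|X_{x,y} - X_{\tilde{x},\tilde{y}}|^2 &= \frac{1}{n} E\Big|\sum_{k=i}^{r} \sum_{j=1}^{n} (\theta_k u_k y_j - \theta_k u_k \tilde{y}_j) g_{k,j} + \sum_{k=r+1}^{p} \sum_{j=1}^{n} (x'_k y_j - \tilde{x}'_k \tilde{y}_j) g_{k,j}\Big|^2 \qquad (6.8) \\
&= \frac{1}{n} \sum_{k=i}^{r} \sum_{j=1}^{n} \theta_k^2 u_k^2 (y_j - \tilde{y}_j)^2 + \frac{1}{n} \sum_{k=r+1}^{p} \sum_{j=1}^{n} \big((x'_k)^2 y_j^2 + (\tilde{x}'_k)^2 \tilde{y}_j^2 - 2x'_k \tilde{x}'_k y_j \tilde{y}_j\big) \\
&= \frac{1}{n} \Big(\sum_{k=i}^{r} \theta_k^2 u_k^2\Big) |y - \tilde{y}|^2 + \frac{1}{n} \big(|x'|^2 + |\tilde{x}'|^2\big) - \frac{2}{n} \langle x', \tilde{x}' \rangle \langle y, \tilde{y} \rangle \\
&= \frac{1}{n} \Big(\sum_{k=i}^{r} \theta_k^2 u_k^2 + 1 - |u|^2\Big) |y - \tilde{y}|^2 + \frac{1}{n} |x' - \tilde{x}'|^2 - \frac{2}{n} \big(1 - |u|^2 - \langle x', \tilde{x}' \rangle\big)\big(1 - \langle y, \tilde{y} \rangle\big) \\
&\le \frac{1}{n} \Big(\sum_{k=i}^{r} \theta_k^2 u_k^2 + 1 - |u|^2\Big) |y - \tilde{y}|^2 + \frac{1}{n} |x' - \tilde{x}'|^2 = E|Y_{x,y} - Y_{\tilde{x},\tilde{y}}|^2
\end{aligned}$$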

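The maximum of $\{Y_{x,y} : x \in \mathcal{S}_i(u), y \in S^{n-1}\}$ is reached when $x'$ is a multiple 
of $(\beta_{r+1}, \cdots, \beta_p)$ and $y$ is a multiple of $(\omega_1, \cdots, \omega_n)$. Since $\theta_{r+j} < 1$, $|x'| \le 
|(x_{r+1}, \cdots, x_p)| = \sqrt{1 - |u|^2}$, we have 

$$\max_{x \in \mathcal{S}_i(u),\, y \in S^{n-1}} Y_{x,y} \le \frac{1}{\sqrt{n}} \sqrt{\sum_{k=i}^{r} \theta_k^2 u_k^2 + 1 - |u|^2}\ \sqrt{\sum_{j=1}^{n} \omega_j^2} + \frac{1}{\sqrt{n}} \sqrt{1 - |u|^2}\ \sqrt{\sum_{j=r+1}^{p} \beta_j^2}$$

Using Slepian's lemma, we have 

$$E[\max_{x \in \mathcal{S}_i(u),\, y \in S^{n-1}} X_{x,y}] \le E[\max_{x \in \mathcal{S}_i(u),\, y \in S^{n-1}} Y_{x,y}] \le \sqrt{\sum_{k=i}^{r} \theta_k^2 u_k^2 + 1 - |u|^2} + \sqrt{\frac{p - r}{n}} \sqrt{1 - |u|^2}$$

Therefore, with $c = \frac{p-r}{n}$, 

$$E[L_{i,u}] \le \sqrt{\sum_{k=i}^{r} \theta_k^2 u_k^2 + 1 - |u|^2} + \sqrt{c}\,\sqrt{1 - |u|^2} = \varphi(u) \qquad (6.9)$$

When $\theta_i^2 > 1 + \sqrt{c}$, the maximum of $\varphi(u)$ is $\sqrt{\theta_i^2 + c \cdot \frac{\theta_i^2}{\theta_i^2 - 1}}$ and is attained 
when 

$$u = \Big(\sqrt{\frac{(\theta_i^2 - 1)^2 - c}{(\theta_i^2 - 1)(\theta_i^2 - 1 + c)}},\ 0, \cdots, 0\Big)$$

When $\theta_i^2 \le 1 + \sqrt{c}$, the maximum of $\varphi(u)$ is $\varphi(0) = 1 + \sqrt{c}$. Hence 

$$E[L_{i,u}] \le \sqrt{\lambda_{\theta_i,c}} \qquad (6.10)$$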



Since E[Xl y ] = i(ELi« 
proposition 5.3 to get 



I ^ r fl2„,2 , \ rr .l\2\ ^ Inl 



) < r^ 2 ; we can use (6.10) and apply 



P(L itU >VXe~c + t)<e^f, t>0 
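Let $\epsilon = \frac{(1-\alpha)t}{\sqrt{\lambda_{\theta_i,c}} + t}$, $0 < \alpha < 1$, then $(1 - \epsilon)(\sqrt{\lambda_{\theta_i,c}} + t) = \sqrt{\lambda_{\theta_i,c}} + \alpha t$. Using (6.6) 
and the above inequality, we have 

$$P(\sqrt{\lambda_i(S_n)} > \sqrt{\lambda_{\theta_i,c}} + t) \le P(\max\{L_{i,u} : u \in \mathcal{N}_i\} > (1 - \epsilon)(\sqrt{\lambda_{\theta_i,c}} + t))$$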



The final step is to use our bound on \ J\f%\ and optimize over a £ (0, 1). When 
t > VpEEHi , choose a = 1(1 + Jl - ^P^), then a > 1-6 > |, < e < A 

_ ^/<5(l-5)n' 2V V "* 2 y ' — -3' - 3 

(as needed in the use of (6.6) and lemma 8.4). When i = r, we use \^K\ < -; 
when 1 < i < r - 1, we use |^| < 4(r "^ +1)2 (1 + g ^r 1 ) r "'- Then 



This finishes the proof of the $r > 0$ case. 

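When $r = 0$, $\sqrt{\lambda_1(S_n)} = \max_{x \in S^{p-1},\, y \in S^{n-1}} X_{x,y}$, $X_{x,y} = x^*(\frac{1}{\sqrt{n}}\Sigma^{1/2}G)y$. For 
$x \in S^{p-1}$, $y \in S^{n-1}$, define 

$$Y_{x,y} = \frac{1}{\sqrt{n}} \sum_{j=1}^{n} y_j \omega_j + \frac{1}{\sqrt{n}} \sum_{k=1}^{p} x'_k \beta_k$$

where $\omega_1, \cdots, \omega_n, \beta_1, \cdots, \beta_p$ are i.i.d. $N(0, 1)$ random variables and 

$$x' = (x'_1, \cdots, x'_p) = (\theta_1 x_1, \cdots, \theta_s x_s, x_{s+1}, \cdots, x_p)$$

Similar to (6.8), we have $E|X_{x,y} - X_{\tilde{x},\tilde{y}}|^2 \le E|Y_{x,y} - Y_{\tilde{x},\tilde{y}}|^2$, $\forall x, \tilde{x} \in S^{p-1}$, $y, \tilde{y} \in 
S^{n-1}$. Thus $E[\sqrt{\lambda_1(S_n)}] \le E[\max_{x \in S^{p-1},\, y \in S^{n-1}} Y_{x,y}] \le 1 + \sqrt{c}$, where $c = p/n$, by Slepian's 
lemma. Using proposition 5.3, we have 

$$P(\sqrt{\lambda_1(S_n)} > 1 + \sqrt{c} + t) \le e^{-\frac{nt^2}{2}}, \quad t \ge 0$$
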

6.2. Lower Tail Bound for the Smallest Eigenvalues 

We prove part (i) of theorem 3.3 in this section. 

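The $s > 0$ case. By the max-min characterization of eigenvalues, for $1 \le i \le s$, we have

\[
\begin{aligned}
\lambda_{p-i+1}(S_n) &= \max_{V \subset \mathbb{R}^p,\ \dim V = p-i+1}\ \min_{x \in S^{p-1} \cap V} x^*S_nx \\
&\ge \min_{x \in S^{p-1} \cap W_i} x^*S_nx = \nu_i^2
\end{aligned} \tag{6.11}
\]

where $W_i = \{x \in \mathbb{R}^p : x_{r+s-i+2} = \dots = x_{r+s} = 0\}$ and

\[
\nu_i = \min_{x \in S^{p-1} \cap W_i}\ \max_{y \in S^{n-1}} X_{x,y}, \quad X_{x,y} = x^*\big(\tfrac{1}{\sqrt{n}}\Sigma^{1/2}G\big)y
\]

Using lemma 8.3, we have

\[
\min_{x \in \mathcal{X}_i}\ \max_{y \in S^{n-1}} X_{x,y} \le \nu_i + \varepsilon\sqrt{\lambda_1(S_n)} \tag{6.12}
\]

When $r = 0$, $i = s$, $\mathcal{X}_s$ is an $\varepsilon$-net for $(S^{p-1} \cap W_s \cap \{x \in \mathbb{R}^p : x_1 \ge 0\}, |\cdot|)$. In all other situations, $\mathcal{X}_i$ is an $\varepsilon$-net for $(S^{p-1} \cap W_i, |\cdot|)$.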

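By lemma 8.4 and lemma 8.5, when $r = 0$, $i = s$, we can arrange

\[
\mathcal{X}_s = \bigcup_{u \in \mathcal{N}_s}\mathcal{S}_s(u) \tag{6.13}
\]

where $\mathcal{S}_s(u) = \{x \in S^{p-1} : x_2 = \dots = x_s = 0,\ x_1 = u\}$, $u \in [0,1]$, and $\mathcal{N}_s$ is a finite subset of $[0,1]$ with $|\mathcal{N}_s| \le \frac{2}{\varepsilon}$; in all other situations

\[
\mathcal{X}_i = \bigcup_{u \in \mathcal{N}_i}\mathcal{S}_i(u) \tag{6.14}
\]

where for $u \in B_2^{r+s-i+1}$

\[
\mathcal{S}_i(u) = \{x \in S^{p-1} : x_{r+s-i+2} = \dots = x_{r+s} = 0,\ (x_1,\dots,x_{r+s-i+1}) = u\}
\]

and $\mathcal{N}_i$ is a finite subset of $B_2^{r+s-i+1}$ with $|\mathcal{N}_i| \le \frac{4(r+s-i+1)^2}{\varepsilon}\big(1 + \frac{2(r+s-i+1)}{(r+s-i)\varepsilon}\big)^{r+s-i}$.
Define $L_{i,u} = \min_{x \in \mathcal{S}_i(u)}\max_{y \in S^{n-1}} X_{x,y}$, then

\[
\min_{x \in \mathcal{X}_i}\ \max_{y \in S^{n-1}} X_{x,y} = \min_{u \in \mathcal{N}_i} L_{i,u} \tag{6.15}
\]

(6.11), (6.12) and (6.15) imply

\[
\min_{u \in \mathcal{N}_i} L_{i,u} \le \sqrt{\lambda_{p-i+1}(S_n)} + \varepsilon\sqrt{\lambda_1(S_n)} \tag{6.16}
\]

To get a lower tail bound for $\lambda_{p-i+1}(S_n)$, we will establish a lower tail bound for each $L_{i,u}$ and then take a union bound. The $\sqrt{\lambda_1(S_n)}$ term will be dealt with using the result of part (i) of theorem 3.2.
For $x \in \mathcal{S}_i(u)$, $y \in S^{n-1}$, consider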



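\[
Y_{x,y} = \frac{1}{\sqrt{n}}\sqrt{\sum_{k=1}^{r+s-i+1}\theta_k^2u_k^2 + 1 - |u|^2}\ \sum_{j=1}^{n}\omega_jy_j + \frac{1}{\sqrt{n}}\sum_{k=r+s+1}^{p}x_k\beta_k
\]

in which $\omega_1,\dots,\omega_n,\beta_{r+s+1},\dots,\beta_p$ are i.i.d. $N(0,1)$ random variables. Similar to (6.8), for $x, \tilde{x} \in \mathcal{S}_i(u)$, $y, \tilde{y} \in S^{n-1}$, we have

\[
E|Y_{x,y} - Y_{\tilde{x},\tilde{y}}|^2 - E|X_{x,y} - X_{\tilde{x},\tilde{y}}|^2 = \frac{2}{n}\Big(1 - |u|^2 - \sum_{k=r+s+1}^{p}x_k\tilde{x}_k\Big)\big(1 - \langle y,\tilde{y}\rangle\big) \tag{6.17}
\]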



This quantity is always non-negative, and equals zero when $x = \tilde{x}$. To proceed, we will use the following generalization of Slepian's lemma.

Proposition 6.1. [13] Let $(X_{i,j})_{1 \le i \le n,\, 1 \le j \le m}$ and $(Y_{i,j})_{1 \le i \le n,\, 1 \le j \le m}$ be two centered Gaussian processes. Assume

\[
E|X_{i,j} - X_{i,k}|^2 \le E|Y_{i,j} - Y_{i,k}|^2
\]
\[
E|X_{i,j} - X_{l,k}|^2 \ge E|Y_{i,j} - Y_{l,k}|^2, \quad i \ne l
\]

Then

\[
E\big[\min_i \max_j X_{i,j}\big] \le E\big[\min_i \max_j Y_{i,j}\big]
\]



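This proposition implies

\[
\begin{aligned}
E\Big[\min_{x \in \mathcal{S}_i(u)}\max_{y \in S^{n-1}} X_{x,y}\Big] &\ge E\Big[\min_{x \in \mathcal{S}_i(u)}\max_{y \in S^{n-1}} Y_{x,y}\Big] \\
&= \frac{1}{\sqrt{n}}\sqrt{\sum_{k=1}^{r+s-i+1}\theta_k^2u_k^2 + 1 - |u|^2}\ E\sqrt{\sum_{j=1}^{n}\omega_j^2} - \frac{1}{\sqrt{n}}\sqrt{1-|u|^2}\ E\sqrt{\sum_{k=r+s+1}^{p}\beta_k^2} \\
&\ge \sqrt{\sum_{k=1}^{r+s-i+1}\theta_k^2u_k^2 + 1 - |u|^2}\ \frac{\sqrt{2}\,\Gamma(\frac{n+1}{2})}{\sqrt{n}\,\Gamma(\frac{n}{2})} - \sqrt{c'}\sqrt{1-|u|^2} \\
&\ge \Big(1 - \frac{1}{2n}\Big)\sqrt{\sum_{k=1}^{r+s-i+1}\theta_k^2u_k^2 + 1 - |u|^2} - \sqrt{c'}\sqrt{1-|u|^2}, \quad c' = \frac{p-r-s}{n}
\end{aligned}
\]

The last step uses Stirling's approximation for $\Gamma$-functions. Therefore

\[
E[L_{i,u}] \ge \sqrt{\sum_{k=1}^{r+s-i+1}\theta_k^2u_k^2 + 1 - |u|^2} - \sqrt{c'}\sqrt{1-|u|^2} - \frac{\theta_1 \vee 1}{2n} = \varphi(u) - \frac{\theta_1 \vee 1}{2n} \tag{6.18}
\]

To avoid heavy notation, let $\tilde{\theta}_i = \theta_{r+s-i+1}$, $1 \le i \le s$, i.e. $\tilde{\theta}_i^2$ is the $i$-th smallest eigenvalue of the population covariance matrix $\Sigma$. When $\tilde{\theta}_i^2 < 1 - \sqrt{c'}$, the minimum of $\varphi(u)$ is $\sqrt{\tilde{\theta}_i^2 + c'\,\frac{\tilde{\theta}_i^2}{\tilde{\theta}_i^2-1}}$ and is attained when

\[
u = \left(0,\ \dots,\ 0,\ \sqrt{\frac{(\tilde{\theta}_i^2-1)^2 - c'}{(\tilde{\theta}_i^2-1)(\tilde{\theta}_i^2-1+c')}}\right)
\]

When $\tilde{\theta}_i^2 \ge 1 - \sqrt{c'}$, the minimum of $\varphi(u)$ is $\varphi(0) = 1 - \sqrt{c'}$. Hence

\[
E[L_{i,u}] \ge \sqrt{\lambda_{\tilde{\theta}_i,c'}} - \frac{\theta_1 \vee 1}{2n} \tag{6.19}
\]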



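Since $E[X_{x,y}^2] = \frac{1}{n}\big(\sum_{k=1}^{r+s-i+1}\theta_k^2u_k^2 + 1 - |u|^2\big) \le \frac{(\theta_1 \vee 1)^2}{n}$, we can use (6.19) and apply proposition 5.3 to get

\[
P\Big(L_{i,u} < \sqrt{\lambda_{\theta_{r+s-i+1},c'}} - \frac{\theta_1 \vee 1}{2n} - t\Big) \le e^{-\frac{nt^2}{2(\theta_1 \vee 1)^2}}
\]

Let $(1-\alpha)t = \varepsilon\tilde{t}$, $0 < \alpha < 1$; (6.16) and the above inequality imply

\[
\begin{aligned}
P\Big(\sqrt{\lambda_{p-i+1}(S_n)} < \sqrt{\lambda_{\theta_{r+s-i+1},c'}} - \frac{\theta_1 \vee 1}{2n} - t\Big)
&\le P\Big(\min_{u \in \mathcal{N}_i} L_{i,u} < \sqrt{\lambda_{\theta_{r+s-i+1},c'}} - \frac{\theta_1 \vee 1}{2n} - \alpha t\Big) + P\big(\sqrt{\lambda_1(S_n)} > \tilde{t}\big) \\
&\le |\mathcal{N}_i|\, e^{-\frac{n\alpha^2t^2}{2(\theta_1 \vee 1)^2}} + P\big(\sqrt{\lambda_1(S_n)} > \tilde{t}\big)
\end{aligned}
\]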

We will use the bound for |^| and make an appropriate choice of a and t. When 

4(r+s-i+l)8j 



r > 0, we pick t = \/Xe llC +t, a = |(1 + y 1 — — — ); when r — 0, we 

pick i=l + *fc + t,a=\{l + v/l - 4(r+ n s :' + ^). This gives 



, Q y 1 (1-y ret z 

P(yV<+i(Sn) < " *) < C l '(n)e"^^ 

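This finishes the proof of the $s > 0$ case.
When $s = 0$, we have

\[
\sqrt{\lambda_p(S_n)} = \min_{x \in S^{p-1}}\ \max_{y \in S^{n-1}} X_{x,y}, \quad X_{x,y} = x^*\big(\tfrac{1}{\sqrt{n}}\Sigma^{1/2}G\big)y
\]

For $u \in B_2^r$, let $\mathcal{S}(u) = \{x \in S^{p-1} : (x_1,\dots,x_r) = u\}$. For $x \in \mathcal{S}(u)$, $y \in S^{n-1}$, define

\[
Y_{x,y} = \frac{1}{\sqrt{n}}\sqrt{\sum_{k=1}^{r}\theta_k^2u_k^2 + 1 - |u|^2}\ \sum_{j=1}^{n}\omega_jy_j + \frac{1}{\sqrt{n}}\sum_{k=r+1}^{p}x_k\beta_k
\]

where $\omega_1,\dots,\omega_n,\beta_{r+1},\dots,\beta_p$ are i.i.d. $N(0,1)$ random variables. Similar to (6.8), for $x, \tilde{x} \in \mathcal{S}(u)$, $y, \tilde{y} \in S^{n-1}$, we have

\[
E|Y_{x,y} - Y_{\tilde{x},\tilde{y}}|^2 - E|X_{x,y} - X_{\tilde{x},\tilde{y}}|^2 = \frac{2}{n}\Big(1 - |u|^2 - \sum_{k=r+1}^{p}x_k\tilde{x}_k\Big)\big(1 - \langle y,\tilde{y}\rangle\big)
\]

This is nonnegative, and equals zero when $x = \tilde{x}$. Using proposition 6.1, when $r = 0$, we have

\[
E\Big[\min_{x \in S^{p-1}}\max_{y \in S^{n-1}} X_{x,y}\Big] \ge E\Big[\min_{x \in S^{p-1}}\max_{y \in S^{n-1}} Y_{x,y}\Big] \ge 1 - \sqrt{c'}
\]

when $r > 0$, we have

\[
E\Big[\min_{x \in \mathcal{S}(u)}\max_{y \in S^{n-1}} X_{x,y}\Big] \ge E\Big[\min_{x \in \mathcal{S}(u)}\max_{y \in S^{n-1}} Y_{x,y}\Big] \ge 1 - \sqrt{c'} - \frac{\theta_1}{2n}
\]

The derivation is similar to the $s > 0$ case.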

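Let $L_u = \min_{x \in \mathcal{S}(u)}\max_{y \in S^{n-1}} X_{x,y}$; the previous two inequalities and proposition 5.3 imply

\[
P\big(\sqrt{\lambda_p(S_n)} < 1 - \sqrt{c'} - t\big) \le e^{-\frac{nt^2}{2}}, \quad t > 0,\ r = 0 \tag{6.20}
\]

This is the result for the $r = s = 0$ case; and

\[
P\Big(L_u < 1 - \sqrt{c'} - \frac{\theta_1}{2n} - t\Big) \le e^{-\frac{nt^2}{2\theta_1^2}}, \quad t > 0,\ r > 0
\]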
2n 

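When $r > 0$, we can use an $\varepsilon$-net argument analogous to the one used in establishing (6.16) to get

\[
\min_{u \in \mathcal{N}} L_u \le \sqrt{\lambda_p(S_n)} + \varepsilon\sqrt{\lambda_1(S_n)}
\]

Here $\mathcal{N}$ is a finite set with $|\mathcal{N}| \le \frac{2}{\varepsilon}$ when $r = 1$, and $|\mathcal{N}| \le \frac{4r^2}{\varepsilon}\big(1 + \frac{2r}{(r-1)\varepsilon}\big)^{r-1}$ when $r > 1$.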

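Let $(1-\alpha)t = \varepsilon(\sqrt{\lambda_{\theta_1,c}} + t)$, $0 < \alpha < 1$. Using the previous two inequalities and the result of part (i) of theorem 3.2, we have

\[
\begin{aligned}
P\Big(\sqrt{\lambda_p(S_n)} < 1 - \sqrt{c'} - \frac{\theta_1}{2n} - t\Big)
&\le P\Big(\min_{u \in \mathcal{N}} L_u < 1 - \sqrt{c'} - \frac{\theta_1}{2n} - \alpha t\Big) + P\big(\sqrt{\lambda_1(S_n)} > \sqrt{\lambda_{\theta_1,c}} + t\big)
\end{aligned}
\]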



Using the bound for \^y\ and choosing a = ^(1 + yj 1 — ^r) gives 



P{\jK{Sn) < 1 -Vc 7 - y -t) < 2C rA (n)e 
This together with (6.20) finishes the proof of the $s = 0$ case.



6.3. Approximate Eigenvectors 

We prove part (ii) of both theorem 3.2 and theorem 3.3 in this section. The proof uses the same method as the proof of part (ii) of theorem 3.1, so we focus on the construction of approximate eigenvectors and omit other details.

Let us first consider approximate eigenvectors associated with the largest eigenvalues. Let $\tilde{G}$ be the lower $(p-r-s) \times n$ submatrix of $\frac{1}{\sqrt{n}}G$. Let $(1+\sqrt{c})^2 < \lambda_0 < \lambda_{\theta_i,c}$, $B = \{\lambda_{\max}(\tilde{G}^*\tilde{G}) \le \lambda_0\}$, then (3.7) in (i) implies

\[
P(B^c) \le e^{-\frac{n}{2}(\sqrt{\lambda_0} - 1 - \sqrt{c})^2} \tag{6.21}
\]

We will construct approximate eigenvectors on the event $B$.

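Let $x \in S^{p-1}$ be such that $x_1 = \dots = x_{i-1} = x_{i+1} = \dots = x_{r+s} = 0$,

\[
x_i^2 = \frac{(\theta_i^2-1)^2 - c}{(\theta_i^2-1)(\theta_i^2-1+c)}
\]

and

\[
\tilde{x} = (x_{r+s+1},\dots,x_p)^* = -\sqrt{1-x_i^2}\ \frac{R\tilde{G}v}{\sqrt{\lambda L_2 + L_1}} \tag{6.22}
\]

\[
v = \frac{1}{\sqrt{n}}(g_{i,1},\dots,g_{i,n})^*
\]

where $\lambda = \lambda_{\theta_i,c}$, $R = (\tilde{G}\tilde{G}^* - \lambda I)^{-1}$, $S = (\tilde{G}^*\tilde{G} - \lambda I)^{-1}$, $L_1 = v^*Sv$, $L_2 = v^*S^2v$.
Then $x^*S_nx = |y|^2$ with

\[
\begin{aligned}
y = \big(\tfrac{1}{\sqrt{n}}G^*\big)\Sigma^{1/2}x &= \theta_ix_iv + \tilde{G}^*\tilde{x} \\
&= \theta_ix_iv - \frac{\sqrt{1-x_i^2}}{\sqrt{\lambda L_2 + L_1}}(I + \lambda S)v \\
&= (aI - \lambda bS)v
\end{aligned}
\]

where $a = \theta_ix_i - b$, $b = \frac{\sqrt{1-x_i^2}}{\sqrt{\lambda L_2 + L_1}}$. Therefore

\[
\lambda - x^*S_nx = \lambda(1 - \lambda b^2L_2) - a^2|v|^2 + 2ab\lambda L_1 \tag{6.23}
\]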
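For completeness, (6.23) follows by expanding $|y|^2$ with $y = (aI - \lambda bS)v$. The short derivation below is a sketch that uses only the symmetry of $S$ and the definitions $L_1 = v^*Sv$, $L_2 = v^*S^2v$:

```latex
\begin{aligned}
x^*S_nx = |y|^2 &= v^*(aI - \lambda bS)^2v \\
&= a^2|v|^2 - 2ab\lambda\, v^*Sv + \lambda^2b^2\, v^*S^2v \\
&= a^2|v|^2 - 2ab\lambda L_1 + \lambda^2b^2L_2, \\
\lambda - x^*S_nx &= \lambda(1 - \lambda b^2L_2) - a^2|v|^2 + 2ab\lambda L_1 .
\end{aligned}
```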

To build a deviation inequality for $\lambda - x^*S_nx$, we will prove $1 - \lambda b^2L_2 \approx 0$, $a \approx 0$, then take a union bound in (6.23). This is accomplished by proving $L_1 \approx g(\lambda)$, $L_2 \approx g'(\lambda)$, where $g(z)$ is the Stieltjes transform of the Marcenko-Pastur distribution, see [21]. More precisely, as $n \to \infty$ with the ratio $p/n$ held constant, the spectral distribution of $\tilde{G}^*\tilde{G}$ converges to a deterministic limiting distribution with Stieltjes transform

\[
g(z) = \frac{c - 1 - z + \sqrt{(z-1-c)^2 - 4c}}{2z}
\]

Thus we can apply the same argument used in proving part (ii) of theorem 3.1 to get

\[
L_1 \approx g(\lambda) = -\frac{1}{\theta_i^2}, \quad L_2 \approx g'(\lambda) = \frac{(\theta_i^2-1)^2}{\theta_i^4\big((\theta_i^2-1)^2 - c\big)} \tag{6.24}
\]
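The limiting values in (6.24) can be checked numerically, together with the fact that they make both $a$ and $1 - \lambda b^2L_2$ vanish. The sketch below substitutes $g(\lambda)$ and $g'(\lambda)$ for $L_1$ and $L_2$; the function names and the sample values $\theta_i^2 = 4$, $c = 0.5$ are illustrative assumptions, not part of the paper.

```python
import math

def g(z, c):
    """Stieltjes transform from the display above, for real z > (1 + sqrt(c))^2."""
    return (c - 1 - z + math.sqrt((z - 1 - c) ** 2 - 4 * c)) / (2 * z)

def g_prime_closed(theta2, c):
    """Closed form of g'(lambda) at lambda = lambda_{theta, c}, as in (6.24)."""
    d = theta2 - 1
    return d * d / (theta2 * theta2 * (d * d - c))

def spike_location(theta2, c):
    """lambda_{theta, c} = theta^2 + c*theta^2/(theta^2 - 1), for theta^2 > 1 + sqrt(c)."""
    return theta2 + c * theta2 / (theta2 - 1)

def g_prime_numeric(z, c, h=1e-5):
    """Central finite difference approximation of g'(z)."""
    return (g(z + h, c) - g(z - h, c)) / (2 * h)

def limiting_coefficients(theta2, c):
    """Return (a, 1 - lam*b^2*L2) with L1, L2 replaced by g(lam), g'(lam)."""
    lam = spike_location(theta2, c)
    d = theta2 - 1
    xi2 = (d * d - c) / (d * (d + c))      # x_i^2 from the construction
    l1, l2 = g(lam, c), g_prime_closed(theta2, c)
    b = math.sqrt((1 - xi2) / (lam * l2 + l1))
    a = math.sqrt(theta2 * xi2) - b
    return a, 1 - lam * b * b * l2
```

Both entries returned by `limiting_coefficients(4.0, 0.5)` should be zero up to rounding, which is the sense in which $x$ is an approximate eigenvector with approximate eigenvalue $\lambda_{\theta_i,c}$.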

The only modification is that we need proposition 8.11, which is a rate of convergence result for sample covariance type matrices, instead of proposition 8.10.

The above is the proof of part (ii) of theorem 3.2. The proof of part (ii) of theorem 3.3 is similar. In this case, we will build approximate eigenvectors on $B = \{\lambda_{\min}(\tilde{G}^*\tilde{G}) \ge \lambda_0\}$, $\lambda_{\theta_{r+s-i+1},c'} < \lambda_0 < (1-\sqrt{c'})^2$, and use (3.10) and (3.11) to get a bound for $P(B^c)$ similar to (6.21). The construction of approximate eigenvectors is done by changing $i$ to $r+s-i+1$ in the above argument.

7. Summary

In this work we considered the extreme eigenvalues of matrices from the deformed GOE and the spiked population model, and proved tight deviation bounds for these eigenvalues. An interesting next direction is to study these problems when the Gaussian distribution is replaced by a stable distribution, which would help complete our picture of the eigenvalues of deformed random matrices.



8. Appendix 

This appendix collects some auxiliary propositions. 

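Lemma 8.1. Let $A \in \mathbb{R}^{n \times n}$ be symmetric, $0 < \varepsilon < \frac{1}{2}$, and let $\mathcal{X}$ be an $\varepsilon$-net for $(S^{n-1}, |\cdot|)$, then

\[
\|A\| \le \frac{1}{1-2\varepsilon}\max_{u \in \mathcal{X}}|u^*Au|
\]

The same inequality holds if $\mathcal{X}$ is an $\varepsilon$-net for $(S_+^{n-1}, |\cdot|)$ where $S_+^{n-1} = \{x \in S^{n-1} : x_1 \ge 0\}$.

Proof. Let $x \in S^{n-1}$ be such that $\|A\| = |x^*Ax|$. We can arrange $x \in S_+^{n-1}$ (by changing $x$ to $-x$ if $x_1 < 0$) when $\mathcal{X}$ is an $\varepsilon$-net for $S_+^{n-1}$. Choose $y \in \mathcal{X}$ which approximates $x$ as $|x - y| \le \varepsilon$. By the triangle inequality, we have

\[
|x^*Ax - y^*Ay| = |x^*A(x-y) + (x-y)^*Ay| \le 2\varepsilon\|A\|
\]

It follows that $|y^*Ay| \ge |x^*Ax| - 2\varepsilon\|A\| = \|A\| - 2\varepsilon\|A\|$, so $\|A\| \le \frac{1}{1-2\varepsilon}|y^*Ay| \le \frac{1}{1-2\varepsilon}\max_{u \in \mathcal{X}}|u^*Au|$. □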
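Lemma 8.1 can be illustrated in dimension $n = 2$, where an explicit $\varepsilon$-net of the circle is available. The sketch below is only an illustration under that assumption: it uses equally spaced points on $S^1$, whose covering radius is the chord length $2\sin(\pi/(2N))$, and the closed-form eigenvalues of a symmetric $2 \times 2$ matrix. All function names are hypothetical.

```python
import math

def circle_net(num_points):
    """Equally spaced points on S^1 and the covering radius eps of this net."""
    pts = [(math.cos(2 * math.pi * k / num_points),
            math.sin(2 * math.pi * k / num_points)) for k in range(num_points)]
    # Every unit vector is within angle pi/num_points of some net point,
    # i.e. within chord length 2*sin(pi/(2*num_points)).
    eps = 2 * math.sin(math.pi / (2 * num_points))
    return pts, eps

def quad_form(a11, a12, a22, u):
    """u^T A u for the symmetric matrix A = [[a11, a12], [a12, a22]]."""
    return a11 * u[0] ** 2 + 2 * a12 * u[0] * u[1] + a22 * u[1] ** 2

def op_norm_sym2(a11, a12, a22):
    """Operator norm of a symmetric 2x2 matrix: the largest |eigenvalue|."""
    mean = (a11 + a22) / 2
    rad = math.hypot((a11 - a22) / 2, a12)
    return max(abs(mean + rad), abs(mean - rad))

def net_max(a11, a12, a22, num_points):
    """Maximum of |u^T A u| over the net."""
    pts, _ = circle_net(num_points)
    return max(abs(quad_form(a11, a12, a22, u)) for u in pts)

def net_bound(a11, a12, a22, num_points):
    """Right-hand side of Lemma 8.1: net maximum divided by (1 - 2*eps)."""
    _, eps = circle_net(num_points)
    if eps >= 0.5:
        raise ValueError("the lemma requires eps < 1/2; use more net points")
    return net_max(a11, a12, a22, num_points) / (1 - 2 * eps)
```

For instance, with `A = [[1, 2], [2, -3]]` and 16 net points, the net maximum is at most `op_norm_sym2(1, 2, -3)`, which in turn is at most `net_bound(1, 2, -3, 16)`.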

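Lemma 8.2. Let $A \in \mathbb{R}^{p \times n}$, $0 < \varepsilon < 1$, and let $\mathcal{X}$ be an $\varepsilon$-net for $(S^{p-1}, |\cdot|)$, then

\[
\|A\| \le \frac{1}{1-\varepsilon}\max_{x \in \mathcal{X},\, y \in S^{n-1}} x^*Ay
\]

The same inequality holds if $\mathcal{X}$ is an $\varepsilon$-net for $(S_+^{p-1}, |\cdot|)$ where $S_+^{p-1} = \{x \in S^{p-1} : x_1 \ge 0\}$.

Proof. Pick $x_0 \in S^{p-1}$, $y_0 \in S^{n-1}$ so that $\|A\| = x_0^*Ay_0$. We can arrange $x_0 \in S_+^{p-1}$ in the $S_+^{p-1}$ case. Find $x \in \mathcal{X}$ with $|x - x_0| \le \varepsilon$, then

\[
|x^*Ay_0 - x_0^*Ay_0| \le \|A\| \cdot |x - x_0| \cdot |y_0| \le \varepsilon\|A\|
\]

Thus

\[
x^*Ay_0 \ge x_0^*Ay_0 - \varepsilon\|A\| = \|A\| - \varepsilon\|A\|
\]

\[
\|A\| \le \frac{1}{1-\varepsilon}x^*Ay_0 \le \frac{1}{1-\varepsilon}\max_{x \in \mathcal{X},\, y \in S^{n-1}} x^*Ay
\]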

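□

Lemma 8.3. Let $A \in \mathbb{R}^{p \times n}$, $p \le n$, and let $s_{\min}(A)$ be the smallest singular value of $A$. Let $\mathcal{X}$ be an $\varepsilon$-net of $(S^{p-1}, |\cdot|)$, then

\[
\min_{x \in \mathcal{X}}\ \max_{y \in S^{n-1}} x^*Ay \le s_{\min}(A) + \varepsilon\|A\|
\]

The same inequality holds if $\mathcal{X}$ is an $\varepsilon$-net of $(S_+^{p-1}, |\cdot|)$ where $S_+^{p-1} = \{x \in S^{p-1} : x_1 \ge 0\}$.

Proof. Let $s_{\min}(A) = x_0^*Ay_0 = |A^*x_0|$, $x_0 \in S^{p-1}$, $y_0 \in S^{n-1}$; we can arrange $x_0 \in S_+^{p-1}$ when $\mathcal{X}$ is an $\varepsilon$-net for $S_+^{p-1}$. Find $x \in \mathcal{X}$ so that $|x - x_0| \le \varepsilon$, then

\[
\begin{aligned}
\min_{x' \in \mathcal{X}}\ \max_{y \in S^{n-1}} x'^*Ay &\le \max_{y \in S^{n-1}} x^*Ay = |A^*x| \\
&\le |A^*x_0| + |A^*(x - x_0)| \le s_{\min}(A) + \varepsilon\|A\|
\end{aligned}
\]

□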

Lemma 8.4. Let $B_2^m = \{x \in \mathbb{R}^m : \sum_{i=1}^{m}x_i^2 \le 1\}$; for $x, y \in B_2^m$, define

\[
\rho_m(x,y) = \sqrt{|x-y|^2 + \big(\sqrt{1-|x|^2} - \sqrt{1-|y|^2}\big)^2}
\]

Then

(i): $\rho_m$ is a metric on $B_2^m$.

(ii) : For < e < -|, there exists an e-net for ([0, l],pi) with size < |. 

(Hi): For < e < i, when m > 2, there exists an e-net for (B™,p m ) with 

size < 4z£(l + r^r)" 1 " 1 . 

— e v (m — l)e ' 

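Proof. (i): To check the triangle inequality, we use $\sqrt{(a+b)(c+d)} \ge \sqrt{ac} + \sqrt{bd}$ to lower bound the cross term in the expansion of $(\rho_m(x,y) + \rho_m(y,z))^2$:

\[
\begin{aligned}
(\rho_m(x,y) + \rho_m(y,z))^2 &\ge |x-y|^2 + \big(\sqrt{1-|x|^2} - \sqrt{1-|y|^2}\big)^2 \\
&\quad + |y-z|^2 + \big(\sqrt{1-|y|^2} - \sqrt{1-|z|^2}\big)^2 \\
&\quad + 2\Big(|x-y| \cdot |y-z| + \big|\sqrt{1-|x|^2} - \sqrt{1-|y|^2}\big| \cdot \big|\sqrt{1-|y|^2} - \sqrt{1-|z|^2}\big|\Big) \\
&\ge |x - y + y - z|^2 + \big(\sqrt{1-|x|^2} - \sqrt{1-|y|^2} + \sqrt{1-|y|^2} - \sqrt{1-|z|^2}\big)^2 \\
&= \rho_m(x,z)^2
\end{aligned}
\]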
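One way to read part (i), offered here as an observation rather than as the paper's argument: $\rho_m(x,y)$ is the Euclidean distance between the lifted points $(x, \sqrt{1-|x|^2})$ and $(y, \sqrt{1-|y|^2})$ on the upper unit hemisphere in $\mathbb{R}^{m+1}$, so the triangle inequality is inherited from $\mathbb{R}^{m+1}$. The sketch below checks both statements on random points; the function names are hypothetical.

```python
import math
import random

def height(x):
    """sqrt(1 - |x|^2), clipped at 0 to absorb rounding for |x| = 1."""
    return math.sqrt(max(0.0, 1 - sum(t * t for t in x)))

def rho(x, y):
    """rho_m(x, y) from Lemma 8.4 for points x, y of the unit ball B_2^m."""
    flat = sum((s - t) ** 2 for s, t in zip(x, y))
    return math.sqrt(flat + (height(x) - height(y)) ** 2)

def lift_distance(x, y):
    """Euclidean distance between the lifts of x and y to the hemisphere."""
    return math.dist(list(x) + [height(x)], list(y) + [height(y)])

def random_ball_point(m, rng):
    """A random point of B_2^m, by rejection sampling from the cube."""
    while True:
        x = [rng.uniform(-1, 1) for _ in range(m)]
        if sum(t * t for t in x) <= 1:
            return x

def max_triangle_violation(m, trials, seed=0):
    """Largest observed value of rho(x,z) - rho(x,y) - rho(y,z), floored at 0."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        x, y, z = (random_ball_point(m, rng) for _ in range(3))
        worst = max(worst, rho(x, z) - rho(x, y) - rho(y, z))
    return worst
```

A call such as `max_triangle_violation(3, 2000)` should return a value that is zero up to rounding.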

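(ii): Let $1 > \eta_1 > \eta_1' > \eta_2 > \eta_2' > \dots > 0$ be such that

\[
\rho_1(x,\eta_1) \le \varepsilon, \quad x \in [\eta_1, 1]
\]
\[
\rho_1(x,\eta_i) \le \varepsilon, \quad x \in [\eta_i', \eta_i],\ \forall i
\]
\[
\rho_1(x,\eta_{i+1}) \le \varepsilon, \quad x \in [\eta_{i+1}, \eta_i'],\ \forall i
\]

Then $\{\eta_1, \eta_2, \dots\}$ is an $\varepsilon$-net. We can pick $\eta_1 = 1 - \frac{\varepsilon^2}{2}$, since $\rho_1(1,\eta_1) = \sqrt{2(1-\eta_1)}$. For $0 \le u < v < 1$, by applying the mean value theorem to $\sqrt{1-u^2} - \sqrt{1-v^2}$, we have

\[
\rho_1(u,v) \le \frac{v-u}{\sqrt{1-v^2}}
\]

Writing $\eta_i = 1 - x_i\varepsilon^2$, $\eta_i' = 1 - x_i'\varepsilon^2$ and using the above inequality, we have

\[
\rho_1(\eta_i,\eta_i') \le \frac{\eta_i - \eta_i'}{\sqrt{1-\eta_i^2}} = \frac{(x_i' - x_i)\varepsilon}{\sqrt{x_i(1+\eta_i)}}
\]

Hence the condition on $\eta_i, \eta_i'$ is satisfied if the following hold

\[
x_i' \le x_i + \sqrt{x_i(1+\eta_i)}
\]
\[
x_{i+1} \le x_i' + \sqrt{x_i'(1+\eta_i')}
\]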



Therefore, we can construct 77^,77- inductively by letting Xi 2 7 X^ Xi ~ r 
\J x% , x^j^\ x^i -\- 2-^/x7- 

By induction on i, it is easy to show that X{ > \i 2 + Let fc be the 
smallest positive integer such that (|fc 2 + \k)e 2 > 1, then 77^-1 > > Thus 
{771, • • • , r/fc-i, 0} is an e-net for ([0, 1], pi), and 



,311, 2 
fc<f^ + ---] + l<- 

The last step uses the condition < e < ^. 

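(iii): The construction of an $\varepsilon$-net in higher dimension is based on two observations: (a): The restriction of $\rho_m$ to a fixed radius is $\rho_1$, i.e. $\rho_m(sx,tx) = \rho_1(s,t)$, $x \in S^{m-1}$, $s,t \in [0,1]$; (b): The restriction of $\rho_m$ to the sphere $S^{m-1}(r) = \{x \in B_2^m : |x| = r\}$ is the Euclidean metric, i.e. $\rho_m(x,y) = |x-y|$, $x,y \in S^{m-1}(r)$.

Let us recall a basic fact about $\varepsilon$-nets of the sphere: for $0 < \varepsilon < 2$, there exists an $\varepsilon$-net for $(S^{m-1}, |\cdot|)$ with cardinality $\le 2m(1+\frac{2}{\varepsilon})^{m-1}$. Proof: Consider a maximal $\varepsilon$-separated subset $A$ of $S^{m-1}$; then $A$ is automatically an $\varepsilon$-net. The $\frac{\varepsilon}{2}$-balls centered at these points are disjoint and contained in $(1+\frac{\varepsilon}{2})B_2^m \setminus (1-\frac{\varepsilon}{2})B_2^m$; by volume counting we get

\[
|A| \cdot \Big(\frac{\varepsilon}{2}\Big)^m \le \Big(1+\frac{\varepsilon}{2}\Big)^m - \Big(1-\frac{\varepsilon}{2}\Big)^m \le m\varepsilon\Big(1+\frac{\varepsilon}{2}\Big)^{m-1}
\]

Hence $|A| \le 2m(1+\frac{2}{\varepsilon})^{m-1}$.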
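To get a feel for how conservative the volume-counting bound is, it can be compared with an explicit net in the case $m = 2$. The sketch below assumes equally spaced points on the circle, whose covering radius is the chord length $2\sin(\pi/(2N))$; this particular net and the function names are illustrative choices, not part of the paper.

```python
import math

def sphere_net_bound(m, eps):
    """Volume-counting bound 2m(1 + 2/eps)^(m-1) for an eps-net of S^(m-1)."""
    return 2 * m * (1 + 2 / eps) ** (m - 1)

def circle_net_size(eps):
    """Smallest number N of equally spaced points forming an eps-net of S^1.

    N equally spaced points cover the circle with chord radius 2*sin(pi/(2N)).
    """
    n = 1
    while 2 * math.sin(math.pi / (2 * n)) > eps:
        n += 1
    return n
```

For $\varepsilon = 0.5$, seven equally spaced points already suffice, while the bound evaluates to 20.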

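Let $\{x^1,\dots,x^M\}$ be an $\alpha\varepsilon$-net for $(S^{m-1}, |\cdot|)$, $M \le 2m(1+\frac{2}{\alpha\varepsilon})^{m-1}$, $0 < \alpha < 1$. Since $(1-\alpha)\varepsilon$ satisfies the condition of (ii), we can find a $(1-\alpha)\varepsilon$-net $\{\eta_1,\dots,\eta_N\}$ for $([0,1],\rho_1)$ with $N \le \frac{2}{(1-\alpha)\varepsilon}$. Then

\[
\mathcal{N} = \{\eta_ix^j : 1 \le i \le N,\ 1 \le j \le M\}
\]

is an $\varepsilon$-net for $(B_2^m, \rho_m)$, and $|\mathcal{N}| \le NM \le \frac{4m}{(1-\alpha)\varepsilon}(1+\frac{2}{\alpha\varepsilon})^{m-1}$. Choosing $\alpha = 1 - \frac{1}{m}$ gives the desired bound.

To see that $\mathcal{N}$ is an $\varepsilon$-net, pick $x \in B_2^m$; we can find $\eta_i$ with $\rho_1(|x|,\eta_i) \le (1-\alpha)\varepsilon$, and $x^j$ with $|\frac{x}{|x|} - x^j| \le \alpha\varepsilon$ (if $x = 0$, replace $\frac{x}{|x|}$ by any $u \in S^{m-1}$). Then

\[
\rho_m\Big(x, \eta_i\frac{x}{|x|}\Big) = \rho_1(|x|,\eta_i) \le (1-\alpha)\varepsilon
\]

\[
\rho_m\Big(\eta_i\frac{x}{|x|}, \eta_ix^j\Big) = \eta_i\Big|\frac{x}{|x|} - x^j\Big| \le \alpha\varepsilon
\]

By the triangle inequality, we have $\rho_m(x,\eta_ix^j) \le (1-\alpha)\varepsilon + \alpha\varepsilon = \varepsilon$. □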
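Lemma 8.5. Assume $1 \le m < n$; for $u \in B_2^m$, define

\[
\mathcal{S}(u) = \{x \in S^{n-1} : (x_1,\dots,x_m) = u\}
\]

Let $\mathcal{N}$ be an $\varepsilon$-net for $(B_2^m, \rho_m)$ where $\rho_m$ is defined in lemma 8.4, then $\mathcal{X} = \bigcup_{u \in \mathcal{N}}\mathcal{S}(u)$ is an $\varepsilon$-net for $(S^{n-1}, |\cdot|)$. When $m = 1$, if $\mathcal{N}$ is an $\varepsilon$-net for $([0,1],\rho_1)$, then $\bigcup_{u \in \mathcal{N}}\mathcal{S}(u)$ is an $\varepsilon$-net for $(S_+^{n-1}, |\cdot|)$.

Proof. For $x \in S^{n-1}$, write $x = (x',x'')$, where $x'$ is the first $m$ coordinates. Since $\mathcal{N}$ is an $\varepsilon$-net for $(B_2^m, \rho_m)$, we can find $u \in \mathcal{N}$ with $\rho_m(x',u) \le \varepsilon$. Then $y = \big(u, \frac{\sqrt{1-|u|^2}}{\sqrt{1-|x'|^2}}x''\big) \in \mathcal{S}(u) \subset \mathcal{X}$ (if $x'' = 0$, let $y = (u, \sqrt{1-|u|^2}\,e)$ for any unit vector $e$) is such that

\[
|x - y| = \sqrt{|x' - u|^2 + \Big|x'' - \frac{\sqrt{1-|u|^2}}{\sqrt{1-|x'|^2}}x''\Big|^2} = \rho_m(x',u) \le \varepsilon
\]

The proof for the $m = 1$, $(S_+^{n-1}, |\cdot|)$ case is similar. □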

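Proposition 8.6. [6] Let $X_1,\dots,X_n$ be i.i.d. $N(0,1)$ random variables, and let $f : \mathbb{R}^n \to \mathbb{R}$ be Lipschitz with constant $\|f\|_{\mathrm{Lip}}$. Then

\[
P\big(f(X) > E[f(X)] + t\big) \le e^{-\frac{t^2}{2\|f\|_{\mathrm{Lip}}^2}}, \quad t > 0
\]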

Proposition 8.7. Let X\, ■ ■ ■ , X n be i.i.d. J\f(0, 1) random variables, Oi, • • • , a„ > 
0, then fort>{) 

U 1 2 

i=l 
n 

p£>(X 4 2 -1)< -|a|t)<e-V 

The proof is based on Chernoff's exponential method, see [18] page 1325.

Proposition 8.8. [19] (McDiarmid's Inequality) Let $X_i$ be an $\mathbb{S}_i$-valued random variable, $1 \le i \le n$, and assume $X_1,\dots,X_n$ to be independent. Let $f : \mathbb{S}_1 \times \dots \times \mathbb{S}_n \to \mathbb{R}$ be Borel measurable. Suppose that there exist positive constants $c_1,\dots,c_n$ such that

\[
|f(x) - f(x_1,\dots,x_{i-1},x_i',x_{i+1},\dots,x_n)| \le c_i \quad \forall x, x_i'
\]

Then for $t > 0$

\[
P\big(f(X) > E[f(X)] + t\big) \le e^{-\frac{2t^2}{\sum_{i=1}^{n}c_i^2}}
\]



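Proposition 8.9. (Cauchy's Interlacing Law) Let $A$ be an $n \times n$ real symmetric matrix; deleting the $i_0$-th row and $i_0$-th column of $A$, we get a matrix $A_{i_0}$. Then the eigenvalues of $A$ and $A_{i_0}$ satisfy

\[
\lambda_1(A) \ge \lambda_1(A_{i_0}) \ge \lambda_2(A) \ge \lambda_2(A_{i_0}) \ge \dots \ge \lambda_{n-1}(A_{i_0}) \ge \lambda_n(A)
\]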

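Proposition 8.10. [17] Let $G \in \mathrm{GOE}(n, \frac{\sigma^2}{n})$, $F_n(x) = \frac{1}{n}|\{j : \lambda_j(G) \le x\}|$. Let $F(x)$ be the distribution function of the semicircle law with density $\frac{1}{2\pi\sigma^2}\sqrt{4\sigma^2 - x^2}\,1_{|x| \le 2\sigma}\,dx$. Then there exists a constant $C$ such that

\[
\sup_{x \in \mathbb{R}}|EF_n(x) - F(x)| \le \frac{C}{n}
\]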

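Proposition 8.11. [14] Let $G \in \mathbb{R}^{p \times n}$ with entries being i.i.d. $N(0,1)$, $S_n = \frac{1}{n}GG^*$, $F_n(x) = \frac{1}{p}|\{j : \lambda_j(S_n) \le x\}|$. Let $F(x)$ be the distribution function of the law with density

\[
p_c(x) = \begin{cases} \dfrac{1}{2\pi cx}\sqrt{\big((1+\sqrt{c})^2 - x\big)\big(x - (1-\sqrt{c})^2\big)} & \text{if } (1-\sqrt{c})^2 \le x \le (1+\sqrt{c})^2, \\ 0 & \text{otherwise,} \end{cases}
\]

with $c = p/n$. Then there exists a constant $C$ such that

\[
\sup_{x \in \mathbb{R}}|EF_n(x) - F(x)| \le C\Big(\frac{1}{n} + \frac{1}{p}\Big)
\]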
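As a quick check that the density in proposition 8.11 is correctly normalized, the sketch below integrates $p_c$ numerically with a midpoint rule. It assumes $0 < c < 1$ (so there is no point mass at zero), and the function names are illustrative; for such $c$ the total mass and the first moment should both be close to 1.

```python
import math

def mp_density(x, c):
    """Marchenko-Pastur density p_c(x) for 0 < c < 1 (zero outside the support)."""
    lo, hi = (1 - math.sqrt(c)) ** 2, (1 + math.sqrt(c)) ** 2
    if x <= lo or x >= hi:
        return 0.0
    return math.sqrt((hi - x) * (x - lo)) / (2 * math.pi * c * x)

def mp_moment(c, power=0, steps=200000):
    """Midpoint-rule approximation of the integral of x^power * p_c(x)."""
    lo, hi = (1 - math.sqrt(c)) ** 2, (1 + math.sqrt(c)) ** 2
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        total += x ** power * mp_density(x, c)
    return total * h
```

For example, `mp_moment(0.5, 0)` and `mp_moment(0.5, 1)` should both be approximately 1.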
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